Preface



This is a book on linear algebra and matrix theory. While it is self contained, it will work 
best for those who have already had some exposure to linear algebra. It is also assumed that 
the reader has had calculus. Some optional topics require more analysis than this, however. 

I think that the subject of linear algebra is likely the most significant topic discussed in 
undergraduate mathematics courses. Part of the reason for this is its usefulness in unifying 
so many different topics. Linear algebra is essential in analysis, applied math, and even in 
theoretical mathematics. This is the point of view of this book, more than a presentation 
of linear algebra for its own sake. This is why there are numerous applications, some fairly 
unusual. 

This book features an ugly, elementary, and complete treatment of determinants early 
in the book. Thus it might be considered as Linear algebra done wrong. I have done this 
because of the usefulness of determinants. However, all major topics are also presented in 
an alternative manner which is independent of determinants. 

The book has an introduction to various numerical methods used in linear algebra. 
This is done because of the interesting nature of these methods. The presentation here 
emphasizes the reasons why they work. It does not discuss many important numerical 
considerations necessary to use the methods effectively. These considerations are found in 
numerical analysis texts. 

In the exercises, you may occasionally see ↑ at the beginning. This means you ought to
have a look at the exercise above it. Some exercises develop a topic sequentially. There are 
also a few exercises which appear more than once in the book. I have done this deliberately 
because I think that these illustrate exceptionally important topics and because some people 
don't read the whole book from start to finish but instead jump in to the middle somewhere. 
There is one on a theorem of Sylvester which appears no fewer than 3 times. Then it is also 
proved in the text. There are multiple proofs of the Cayley Hamilton theorem, some in the 
exercises. Some exercises also are included for the sake of emphasizing something which has 
been done in the preceding chapter.
Preliminaries



1.1 Sets And Set Notation 

A set is just a collection of things called elements. For example $\{1, 2, 3, 8\}$ would be a set
consisting of the elements 1, 2, 3, and 8. To indicate that 3 is an element of $\{1, 2, 3, 8\}$, it is
customary to write $3 \in \{1, 2, 3, 8\}$. $9 \notin \{1, 2, 3, 8\}$ means 9 is not an element of $\{1, 2, 3, 8\}$.
Sometimes a rule specifies a set. For example you could specify a set as all integers larger
than 2. This would be written as $S = \{x \in \mathbb{Z} : x > 2\}$. This notation says: the set of all
integers, x, such that $x > 2$.

If A and B are sets with the property that every element of A is an element of B, then A is
a subset of B. For example, $\{1, 2, 3, 8\}$ is a subset of $\{1, 2, 3, 4, 5, 8\}$, in symbols, $\{1, 2, 3, 8\} \subseteq
\{1, 2, 3, 4, 5, 8\}$. It is sometimes said that "A is contained in B" or even "B contains A".
The same statement about the two sets may also be written as $\{1, 2, 3, 4, 5, 8\} \supseteq \{1, 2, 3, 8\}$.

The union of two sets is the set consisting of everything which is an element of at least
one of the sets, A or B. As an example of the union of two sets $\{1, 2, 3, 8\} \cup \{3, 4, 7, 8\} =
\{1, 2, 3, 4, 7, 8\}$ because these numbers are those which are in at least one of the two sets. In
general,

$$A \cup B \equiv \{x : x \in A \text{ or } x \in B\}.$$

Be sure you understand that something which is in both A and B is in the union. It is not 
an exclusive or. 

The intersection of two sets, A and B, consists of everything which is in both of the sets.
Thus $\{1, 2, 3, 8\} \cap \{3, 4, 7, 8\} = \{3, 8\}$ because 3 and 8 are those elements the two sets have
in common. In general,

$$A \cap B \equiv \{x : x \in A \text{ and } x \in B\}.$$

The symbol $[a, b]$ where a and b are real numbers, denotes the set of real numbers x,
such that $a \le x \le b$ and $[a, b)$ denotes the set of real numbers such that $a \le x < b$. $(a, b)$
consists of the set of real numbers x such that $a < x < b$ and $(a, b]$ indicates the set of
numbers x such that $a < x \le b$. $[a, \infty)$ means the set of all numbers x such that $x \ge a$ and
$(-\infty, a]$ means the set of all real numbers which are less than or equal to a. These sorts of
sets of real numbers are called intervals. The two points a and b are called endpoints of the
interval. Other intervals such as $(-\infty, b)$ are defined by analogy to what was just explained.
In general, the curved parenthesis indicates the end point it sits next to is not included
while the square parenthesis indicates this end point is included. The reason that there
will always be a curved parenthesis next to $\infty$ or $-\infty$ is that these are not real numbers.
Therefore, they cannot be included in any set of real numbers.

A special set which needs to be given a name is the empty set also called the null set,
denoted by $\emptyset$. Thus $\emptyset$ is defined as the set which has no elements in it. Mathematicians like
to say the empty set is a subset of every set. The reason they say this is that if it were not
so, there would have to exist a set A, such that $\emptyset$ has something in it which is not in A.
However, $\emptyset$ has nothing in it and so the least intellectual discomfort is achieved by saying

$$\emptyset \subseteq A.$$

If A and B are two sets, $A \setminus B$ denotes the set of things which are in A but not in B.
Thus

$$A \setminus B \equiv \{x \in A : x \notin B\}.$$

Set notation is used whenever convenient.
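
The set operations of this section can be tried out directly. The following is a minimal sketch in Python, whose built-in set type happens to mirror the notation; the variable names are chosen here for illustration and are not from the text. The rule-defined set is restricted to a finite range so that it can be listed.

```python
# Python's built-in set type mirrors the notation of this section.
A = {1, 2, 3, 8}
B = {3, 4, 7, 8}

union = A | B            # A ∪ B: elements in at least one of the sets
intersection = A & B     # A ∩ B: elements in both of the sets
difference = A - B       # A \ B: elements of A which are not in B

is_member = 3 in A                     # 3 ∈ A
is_subset = A <= {1, 2, 3, 4, 5, 8}    # A ⊆ {1, 2, 3, 4, 5, 8}
empty_is_subset = set() <= A           # the empty set is a subset of A

# A set specified by a rule, cut down to a finite range (an assumption made
# only so the set can be listed): {x ∈ Z : x > 2 and -10 <= x < 10}.
S = {x for x in range(-10, 10) if x > 2}

print(union, intersection, difference, S)
```

Note that 3 and 8, which lie in both A and B, appear in the union, in agreement with the remark that the "or" is not exclusive.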

1.2 Functions 

The concept of a function is that of something which gives a unique output for a given input.

Definition 1.2.1 Consider two sets, D and R along with a rule which assigns a unique
element of R to every element of D. This rule is called a function and it is denoted by a
letter such as f. Given $x \in D$, $f(x)$ is the name of the thing in R which results from doing
f to x. Then D is called the domain of f. In order to specify that D pertains to f, the
notation $D(f)$ may be used. The set R is sometimes called the range of f. These days it
is referred to as the codomain. The set of all elements of R which are of the form $f(x)$
for some $x \in D$ is therefore a subset of R. This is sometimes referred to as the image of
f. When this set equals R, the function f is said to be onto, also surjective. If whenever
$x \ne y$ it follows $f(x) \ne f(y)$, the function is called one to one, also injective. It is
common notation to write $f : D \to R$ to denote the situation just described in this definition
where f is a function defined on a domain D which has values in a codomain R. Sometimes
you may also see something like $D \xrightarrow{f} R$ to denote the same thing.

1.3 The Number Line And Algebra Of The Real Numbers

Next, consider the real numbers, denoted by $\mathbb{R}$, as a line extending infinitely far in both
directions. In this book, the notation $\equiv$ indicates something is being defined. Thus the
integers are defined as

$$\mathbb{Z} \equiv \{\cdots, -1, 0, 1, \cdots\},$$

the natural numbers,

$$\mathbb{N} \equiv \{1, 2, \cdots\}$$

and the rational numbers, defined as the numbers which are the quotient of two integers,

$$\mathbb{Q} \equiv \left\{ \frac{m}{n} \text{ such that } m, n \in \mathbb{Z},\ n \ne 0 \right\}$$

are each subsets of $\mathbb{R}$ as indicated in the following picture.

[Figure: the number line with the point 1/2 marked.]

As shown in the picture, $\frac{1}{2}$ is half way between the number 0 and the number 1. By
analogy, you can see where to place all the other rational numbers. It is assumed that $\mathbb{R}$ has
the following algebra properties, listed here as a collection of assertions called axioms. These 
properties will not be proved which is why they are called axioms rather than theorems. In 
general, axioms are statements which are regarded as true. Often these are things which 
are "self evident" either from experience or from some sort of intuition but this does not 
have to be the case. 

Axiom 1.3.1 $x + y = y + x$, (commutative law for addition).

Axiom 1.3.2 $x + 0 = x$, (additive identity).

Axiom 1.3.3 For each $x \in \mathbb{R}$, there exists $-x \in \mathbb{R}$ such that $x + (-x) = 0$, (existence of
additive inverse).

Axiom 1.3.4 $(x + y) + z = x + (y + z)$, (associative law for addition).

Axiom 1.3.5 $xy = yx$, (commutative law for multiplication).

Axiom 1.3.6 $(xy)z = x(yz)$, (associative law for multiplication).

Axiom 1.3.7 $1x = x$, (multiplicative identity).

Axiom 1.3.8 For each $x \ne 0$, there exists $x^{-1}$ such that $xx^{-1} = 1$, (existence of multiplicative
inverse).

Axiom 1.3.9 $x(y + z) = xy + xz$, (distributive law).

These axioms are known as the field axioms and any set (there are many others besides
$\mathbb{R}$) which has two such operations satisfying the above axioms is called a field. Division and
subtraction are defined in the usual way by $x - y \equiv x + (-y)$ and $x/y \equiv x(y^{-1})$.
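
To make the remark that there are many other fields concrete, here is a small sketch which checks Axioms 1.3.1 through 1.3.9 by brute force for the integers $\{0, 1, \cdots, p-1\}$ with addition and multiplication carried out modulo p. This example is not taken from the text, and the function name satisfies_field_axioms is hypothetical. The check should succeed when p is prime and fail otherwise.

```python
from itertools import product

def satisfies_field_axioms(p):
    """Brute-force check of Axioms 1.3.1-1.3.9 for {0,...,p-1} with arithmetic mod p."""
    elems = range(p)
    add = lambda x, y: (x + y) % p
    mul = lambda x, y: (x * y) % p
    for x, y, z in product(elems, repeat=3):
        if add(x, y) != add(y, x):                          # Axiom 1.3.1
            return False
        if add(add(x, y), z) != add(x, add(y, z)):          # Axiom 1.3.4
            return False
        if mul(x, y) != mul(y, x):                          # Axiom 1.3.5
            return False
        if mul(mul(x, y), z) != mul(x, mul(y, z)):          # Axiom 1.3.6
            return False
        if mul(x, add(y, z)) != add(mul(x, y), mul(x, z)):  # Axiom 1.3.9
            return False
    for x in elems:
        if add(x, 0) != x or mul(1, x) != x:                # Axioms 1.3.2, 1.3.7
            return False
        if not any(add(x, w) == 0 for w in elems):          # Axiom 1.3.3
            return False
        if x != 0 and not any(mul(x, w) == 1 for w in elems):  # Axiom 1.3.8
            return False
    return True

print(satisfies_field_axioms(5), satisfies_field_axioms(6))
```

With p = 6 the check fails at Axiom 1.3.8, because 2 has no multiplicative inverse modulo 6.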

Here is a little proposition which derives some familiar facts. 

Proposition 1.3.10 0 and 1 are unique. Also $-x$ is unique and $x^{-1}$ is unique. Furthermore,
$0x = x0 = 0$ and $-x = (-1)x$.

Proof: Suppose $0'$ is another additive identity. Then

$$0' = 0' + 0 = 0.$$

Thus 0 is unique. Say $1'$ is another multiplicative identity. Then

$$1 = 1'1 = 1'.$$

Now suppose y acts like the additive inverse of x. Then

$$-x = (-x) + 0 = (-x) + (x + y) = (-x + x) + y = y.$$

The uniqueness of $x^{-1}$ is shown in the same way, with multiplication in place of addition. Next,

$$0x = (0 + 0)x = 0x + 0x$$

and so

$$0 = -(0x) + 0x = -(0x) + (0x + 0x) = (-(0x) + 0x) + 0x = 0x.$$

Finally

$$x + (-1)x = (1 + (-1))x = 0x = 0$$

and so by uniqueness of the additive inverse, $(-1)x = -x$. ■

1.4 Ordered fields 

The real numbers $\mathbb{R}$ are an example of an ordered field. More generally, here is a definition.

Definition 1.4.1 Let F be a field. It is an ordered field if there exists an order, < which
satisfies

1. For any $x \ne y$, either $x < y$ or $y < x$.

2. If $x < y$ and either $z < w$ or $z = w$, then $x + z < y + w$.

3. If $0 < x$, $0 < y$, then $xy > 0$.

With this definition, the familiar properties of order can be proved. The following
proposition lists many of these familiar properties. The relation 'a > b' has the same
meaning as 'b < a'.

Proposition 1.4.2 The following are obtained.

1. If $x < y$ and $y < z$, then $x < z$.

2. If $x > 0$ and $y > 0$, then $x + y > 0$.

3. If $x > 0$, then $-x < 0$.

4. If $x \ne 0$, either x or $-x$ is $> 0$.

5. If $x < y$, then $-x > -y$.

6. If $x \ne 0$, then $x^2 > 0$.

7. If $0 < x < y$ then $x^{-1} > y^{-1}$.

Proof: First consider 1, called the transitive law. Suppose that $x < y$ and $y < z$. Then
from the axioms, $x + y < y + z$ and so, adding $-y$ to both sides, it follows

$$x < z.$$

Next consider 2. Suppose $x > 0$ and $y > 0$. Then from 2 of Definition 1.4.1,

$$0 = 0 + 0 < x + y.$$

Next consider 3. It is assumed $x > 0$ so

$$0 = -x + x > 0 + (-x) = -x.$$

Now consider 4. If $x < 0$, then

$$0 = x + (-x) < 0 + (-x) = -x.$$

Consider 5. Since $x < y$, it follows from 2 of Definition 1.4.1 that

$$0 = x + (-x) < y + (-x)$$

and so by 3 and Proposition 1.3.10,

$$(-1)(y + (-x)) < 0.$$

Also from Proposition 1.3.10, $(-1)(-x) = -(-x) = x$ and so

$$-y + x < 0.$$

Hence

$$-y < -x.$$

Consider 6. If $x > 0$, there is nothing to show. It follows from the definition. If $x < 0$,
then by 4, $-x > 0$ and so by Proposition 1.3.10 and the definition of the order,

$$(-x)^2 = (-1)(-1)x^2 > 0.$$

By this proposition again, $(-1)(-1) = -(-1) = 1$ and so $x^2 > 0$ as claimed. Note that
$1 > 0$ because it equals $1^2$.

Finally, consider 7. First, if $x > 0$ then if $x^{-1} < 0$, it would follow $(-1)x^{-1} > 0$ and so
$x(-1)x^{-1} = (-1)1 = -1 > 0$. However, this would require

$$0 > 1 = 1^2 > 0$$

from what was just shown. Therefore, $x^{-1} > 0$. Now the assumption implies $y + (-1)x > 0$
and so multiplying by $x^{-1}$,

$$yx^{-1} + (-1)xx^{-1} = yx^{-1} + (-1) > 0.$$

Now multiply by $y^{-1}$, which by the above satisfies $y^{-1} > 0$, to obtain

$$x^{-1} + (-1)y^{-1} > 0$$

and so

$$x^{-1} > y^{-1}. ■$$

In an ordered field the symbols $\le$ and $\ge$ have the usual meanings. Thus $a \le b$ means
$a < b$ or else $a = b$, etc.

1.5 The Complex Numbers 

Just as a real number should be considered as a point on the line, a complex number is 
considered a point in the plane which can be identified in the usual way using the Cartesian 
coordinates of the point. Thus (a, b) identifies a point whose x coordinate is a and whose 
y coordinate is b. In dealing with complex numbers, such a point is written as a + ib and 
multiplication and addition are defined in the most obvious way subject to the convention 
that $i^2 = -1$. Thus,

$$(a + ib) + (c + id) = (a + c) + i(b + d)$$

and

$$(a + ib)(c + id) = ac + iad + ibc + i^2 bd = (ac - bd) + i(bc + ad).$$

Every non zero complex number, $a + ib$, with $a^2 + b^2 \ne 0$, has a unique multiplicative inverse.

$$\frac{1}{a + ib} = \frac{a - ib}{a^2 + b^2} = \frac{a}{a^2 + b^2} - i\frac{b}{a^2 + b^2}.$$
You should prove the following theorem. 

Theorem 1.5.1 The complex numbers with multiplication and addition defined as above
form a field satisfying all the field axioms listed in Section 1.3.
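
The multiplication rule and the formula for the inverse can be checked numerically. The following is only a sketch: complex numbers are represented as pairs (a, b) standing for $a + ib$, and the function names multiply and inverse are illustrative, not notation from the text. The result is compared with Python's built-in complex type.

```python
def multiply(a, b, c, d):
    """(a + ib)(c + id) = (ac - bd) + i(bc + ad), returned as a pair."""
    return a * c - b * d, b * c + a * d

def inverse(a, b):
    """Real and imaginary parts of 1/(a + ib), assuming a^2 + b^2 != 0."""
    d = a * a + b * b
    return a / d, -b / d

# Check with the sample number 3 + i4 (chosen arbitrarily).
re, im = inverse(3.0, 4.0)
one_re, one_im = multiply(3.0, 4.0, re, im)   # should be close to (1, 0)
builtin = 1 / complex(3.0, 4.0)               # Python's own complex division

print((re, im), (one_re, one_im), builtin)
```

A numerical check of this kind is of course not a proof of Theorem 1.5.1; it only guards against sign errors in the formulas.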

Note that if $x + iy$ is a nonzero complex number, it can be written as

$$x + iy = \sqrt{x^2 + y^2}\left(\frac{x}{\sqrt{x^2 + y^2}} + i\frac{y}{\sqrt{x^2 + y^2}}\right).$$

Now $\left(\frac{x}{\sqrt{x^2 + y^2}}, \frac{y}{\sqrt{x^2 + y^2}}\right)$ is a point on the unit circle and so there exists a unique $\theta \in [0, 2\pi)$
such that this ordered pair equals $(\cos\theta, \sin\theta)$. Letting $r = \sqrt{x^2 + y^2}$, it follows that the
complex number can be written in the form

$$x + iy = r(\cos\theta + i\sin\theta).$$

This is called the polar form of the complex number.

The field of complex numbers is denoted as C. An important construction regarding 
complex numbers is the complex conjugate denoted by a horizontal line above the number. 
It is defined as follows. 

$$\overline{a + ib} \equiv a - ib.$$

What it does is reflect a given complex number across the x axis. Algebraically, the following
formula is easy to obtain.

$$\left(\overline{a + ib}\right)(a + ib) = a^2 + b^2.$$

Definition 1.5.2 Define the absolute value of a complex number as follows.

$$|a + ib| \equiv \sqrt{a^2 + b^2}.$$

Thus, denoting by z the complex number, $z = a + ib$,

$$|z| = (z\overline{z})^{1/2}.$$

With this definition, it is important to note the following. Be sure to verify this. It is 
not too hard but you need to do it. 



Remark 1.5.3: Let $z = a + ib$ and $w = c + id$. Then $|z - w| = \sqrt{(a - c)^2 + (b - d)^2}$. Thus
the distance between the point in the plane determined by the ordered pair, $(a, b)$ and the
ordered pair $(c, d)$ equals $|z - w|$ where z and w are as just described.

For example, consider the distance between $(2, 5)$ and $(1, 8)$. From the distance formula
this distance equals $\sqrt{(2 - 1)^2 + (5 - 8)^2} = \sqrt{10}$. On the other hand, letting $z = 2 + i5$ and
$w = 1 + i8$, $z - w = 1 - i3$ and so $(z - w)\overline{(z - w)} = (1 - i3)(1 + i3) = 10$ so $|z - w| = \sqrt{10}$,
the same thing obtained with the distance formula.

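Complex numbers are often written in the so called polar form which is described next.
Suppose $x + iy$ is a nonzero complex number. Then

$$x + iy = \sqrt{x^2 + y^2}\left(\frac{x}{\sqrt{x^2 + y^2}} + i\frac{y}{\sqrt{x^2 + y^2}}\right).$$

Now note that

$$\left(\frac{x}{\sqrt{x^2 + y^2}}\right)^2 + \left(\frac{y}{\sqrt{x^2 + y^2}}\right)^2 = 1$$

and so

$$\left(\frac{x}{\sqrt{x^2 + y^2}}, \frac{y}{\sqrt{x^2 + y^2}}\right)$$

is a point on the unit circle. Therefore, there exists a unique angle, $\theta \in [0, 2\pi)$ such that

$$\cos\theta = \frac{x}{\sqrt{x^2 + y^2}}, \quad \sin\theta = \frac{y}{\sqrt{x^2 + y^2}}.$$

The polar form of the complex number is then

$$r(\cos\theta + i\sin\theta)$$

where $\theta$ is this angle just described and $r = \sqrt{x^2 + y^2}$.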

A fundamental identity is the formula of De Moivre which follows. 

Theorem 1.5.4 Let $r > 0$ be given. Then if n is a positive integer,

$$[r(\cos t + i\sin t)]^n = r^n(\cos nt + i\sin nt).$$

Proof: It is clear the formula holds if $n = 1$. Suppose it is true for n. Then

$$[r(\cos t + i\sin t)]^{n+1} = [r(\cos t + i\sin t)]^n [r(\cos t + i\sin t)]$$

which by induction equals

$$= r^{n+1}(\cos nt + i\sin nt)(\cos t + i\sin t)$$

$$= r^{n+1}((\cos nt\cos t - \sin nt\sin t) + i(\sin nt\cos t + \cos nt\sin t))$$

$$= r^{n+1}(\cos(n + 1)t + i\sin(n + 1)t)$$

by the formulas for the cosine and sine of the sum of two angles. ■
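
De Moivre's formula is easy to test numerically. The sketch below computes both sides for given r, t and n using Python's built-in complex numbers; the function name de_moivre_sides and the sample values are chosen here purely for illustration.

```python
import math

def de_moivre_sides(r, t, n):
    """Return both sides of [r(cos t + i sin t)]^n = r^n (cos nt + i sin nt)."""
    lhs = (r * complex(math.cos(t), math.sin(t))) ** n
    rhs = r ** n * complex(math.cos(n * t), math.sin(n * t))
    return lhs, rhs

# Sample values (arbitrary): r = 2, t = 0.7, n = 5.
lhs, rhs = de_moivre_sides(2.0, 0.7, 5)
print(lhs, rhs, abs(lhs - rhs))
```

The two sides agree only up to floating point rounding, so the difference printed is small rather than exactly zero.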

Corollary 1.5.5 Let z be a non zero complex number. Then there are always exactly k $k^{th}$
roots of z in $\mathbb{C}$.

Proof: Let $z = x + iy$ and let $z = |z|(\cos t + i\sin t)$ be the polar form of the complex
number. By De Moivre's theorem, a complex number,

$$r(\cos\alpha + i\sin\alpha),$$

is a $k^{th}$ root of z if and only if

$$r^k(\cos k\alpha + i\sin k\alpha) = |z|(\cos t + i\sin t).$$

This requires $r^k = |z|$ and so $r = |z|^{1/k}$ and also both $\cos(k\alpha) = \cos t$ and $\sin(k\alpha) = \sin t$.
This can only happen if

$$k\alpha = t + 2l\pi$$

for l an integer. Thus

$$\alpha = \frac{t + 2l\pi}{k}, \quad l \in \mathbb{Z}$$

and so the $k^{th}$ roots of z are of the form

$$|z|^{1/k}\left(\cos\left(\frac{t + 2l\pi}{k}\right) + i\sin\left(\frac{t + 2l\pi}{k}\right)\right), \quad l \in \mathbb{Z}.$$

Since the cosine and sine are periodic of period $2\pi$, there are exactly k distinct numbers
which result from this formula. ■
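
The formula for the $k^{th}$ roots translates directly into a short computation. The following sketch uses Python's cmath module; the function name kth_roots is hypothetical. One assumption differs slightly from the text: cmath.phase returns an angle in $(-\pi, \pi]$ rather than $[0, 2\pi)$, which does not matter here since any angle t with $z = |z|(\cos t + i\sin t)$ yields the same set of roots.

```python
import cmath
import math

def kth_roots(z, k):
    """Return the k distinct k-th roots of a nonzero complex number z."""
    r = abs(z) ** (1.0 / k)
    t = cmath.phase(z)   # an angle with z = |z|(cos t + i sin t)
    return [r * complex(math.cos((t + 2 * math.pi * l) / k),
                        math.sin((t + 2 * math.pi * l) / k))
            for l in range(k)]   # l = 0, 1, ..., k - 1 gives all distinct roots

cube_roots_of_i = kth_roots(1j, 3)
for w in cube_roots_of_i:
    print(w, w ** 3)   # each cube should be close to i
```

Taking l outside the range 0, ..., k - 1 only repeats roots already found, which is the periodicity remark at the end of the proof.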



Example 1.5.6 Find the three cube roots of i.

First note that $i = 1\left(\cos\left(\frac{\pi}{2}\right) + i\sin\left(\frac{\pi}{2}\right)\right)$. Using the formula in the proof of the above
corollary, the cube roots of i are

$$1\left(\cos\left(\frac{(\pi/2) + 2l\pi}{3}\right) + i\sin\left(\frac{(\pi/2) + 2l\pi}{3}\right)\right)$$

where $l = 0, 1, 2$. Thus the cube roots of i are $\frac{\sqrt{3}}{2} + i\left(\frac{1}{2}\right)$, $-\frac{\sqrt{3}}{2} + i\left(\frac{1}{2}\right)$, and $-i$.

The ability to find k th roots can also be used to factor some polynomials. 

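Example 1.5.7 Factor the polynomial $x^3 - 27$.

First find the cube roots of 27. By the above procedure using De Moivre's theorem,
these cube roots are $3$, $3\left(\frac{-1}{2} + i\frac{\sqrt{3}}{2}\right)$, and $3\left(\frac{-1}{2} - i\frac{\sqrt{3}}{2}\right)$. Therefore,

$$x^3 - 27 = (x - 3)\left(x - 3\left(\frac{-1}{2} + i\frac{\sqrt{3}}{2}\right)\right)\left(x - 3\left(\frac{-1}{2} - i\frac{\sqrt{3}}{2}\right)\right).$$

Note also that $\left(x - 3\left(\frac{-1}{2} + i\frac{\sqrt{3}}{2}\right)\right)\left(x - 3\left(\frac{-1}{2} - i\frac{\sqrt{3}}{2}\right)\right) = x^2 + 3x + 9$ and so

$$x^3 - 27 = (x - 3)(x^2 + 3x + 9)$$

where the quadratic polynomial, $x^2 + 3x + 9$ cannot be factored without using complex
numbers.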

The real and complex numbers both are fields satisfying the axioms in Section 1.3 and it is
usually one of these two fields which is used in linear algebra. The numbers are often called
scalars. However, it turns out that all algebraic notions work for any field and there are
many others. For this reason, I will often refer to the field of scalars as F although F will
usually be either the real or complex numbers. If there is any doubt, assume it is the field
of complex numbers which is meant. The reason the complex numbers are so significant in
linear algebra is that they are algebraically complete. This means that every polynomial
$\sum_{k=0}^{n} a_k z^k$, $n \ge 1$, $a_n \ne 0$, having coefficients $a_k$ in $\mathbb{C}$ has a root in $\mathbb{C}$.

Later in the book, proofs of the fundamental theorem of algebra are given. However, here
is a simple explanation of why you should believe this theorem. The issue is whether there
exists $z \in \mathbb{C}$ such that $p(z) = 0$ for $p(z)$ a polynomial having coefficients in $\mathbb{C}$. Dividing by
the leading coefficient, we can assume that $p(z)$ is of the form

$$p(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0,$$

where it may be assumed that $a_0 \ne 0$, since if $a_0 = 0$, there is nothing to prove because
$p(0) = 0$. Denote by $C_r$ the circle of radius r in the complex plane
which is centered at 0. Then if r is sufficiently large and $|z| = r$, the term $z^n$ is far larger
than the rest of the polynomial. Thus, for r large enough, $A_r = \{p(z) : z \in C_r\}$ describes
a closed curve which misses the inside of some circle having 0 as its center. Now shrink r.
Eventually, for r small enough, the non constant terms are negligible and so $A_r$ is a curve
which is contained in some circle centered at $a_0$ which has 0 in its outside.

[Figure: the curve $A_r$ for r large and for r small.]

Thus it is reasonable to believe that for some r during this shrinking process, the set
$A_r$ must hit 0. It follows that $p(z) = 0$ for some z. This is one of those arguments which
seems all right until you think about it too much. Nevertheless, it will suffice to see that
the fundamental theorem of algebra is at least very plausible. A complete proof is in an
appendix.
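
The shrinking circle picture can be explored numerically. The sketch below counts how many times the curve $A_r$ winds around 0 by adding up the small changes in angle between consecutive sample points. Everything in it is an illustrative assumption rather than part of the text: the function name winding_number, the sample polynomial $p(z) = z^3 + z + 1$, and the number of sample points. It also assumes the curve does not pass through 0 for the radius chosen.

```python
import cmath
import math

def winding_number(coeffs, r, samples=4000):
    """Count how many times {p(z) : |z| = r} winds around 0.

    coeffs lists a_0, a_1, ..., a_n. Assumes p has no zero on the circle.
    """
    def p(z):
        return sum(a * z ** k for k, a in enumerate(coeffs))

    total = 0.0
    prev = p(complex(r, 0))
    for j in range(1, samples + 1):
        z = r * cmath.exp(2j * math.pi * j / samples)
        cur = p(z)
        total += cmath.phase(cur / prev)   # small change in angle between samples
        prev = cur
    return round(total / (2 * math.pi))

coeffs = [1, 1, 0, 1]   # p(z) = z^3 + z + 1, so a_0 = 1
print(winding_number(coeffs, 10.0), winding_number(coeffs, 0.1))
```

For r = 10 the term $z^3$ dominates and the curve goes around 0 three times, while for r = 0.1 the curve stays near $a_0 = 1$ and does not go around 0 at all. The count can only change when the curve passes through 0, which is the heuristic reason a root must exist.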
1.6 Exercises

1. Let $z = 5 + i9$. Find $z^{-1}$.

2. Let $z = 2 + i7$ and let $w = 3 - i8$. Find $zw$, $z + w$, $z^2$, and $w/z$.

3. Give the complete solution to $x^4 + 16 = 0$.

4. Graph the complex cube roots of 8 in the complex plane. Do the same for the four
fourth roots of 16.

5. If z is a complex number, show there exists $\omega$ a complex number with $|\omega| = 1$ and
$\omega z = |z|$.

6. De Moivre's theorem says $[r(\cos t + i\sin t)]^n = r^n(\cos nt + i\sin nt)$ for n a positive
integer. Does this formula continue to hold for all integers, n, even negative integers?
Explain.

7. You already know formulas for $\cos(x + y)$ and $\sin(x + y)$ and these were used to prove
De Moivre's theorem. Now using De Moivre's theorem, derive a formula for $\sin(5x)$
and one for $\cos(5x)$. Hint: Use the binomial theorem.

8. If z and w are two complex numbers and the polar form of z involves the angle $\theta$ while
the polar form of w involves the angle $\phi$, show that in the polar form for zw the angle
involved is $\theta + \phi$. Also, show that in the polar form of a complex number, z, $r = |z|$.

9. Factor $x^3 + 8$ as a product of linear factors.

10. Write $x^3 + 27$ in the form $(x + 3)(x^2 + ax + b)$ where $x^2 + ax + b$ cannot be factored
any more using only real numbers.

11. Completely factor $x^4 + 16$ as a product of linear factors.

12. Factor $x^4 + 16$ as the product of two quadratic polynomials each of which cannot be
factored further without using complex numbers.

13. If z, w are complex numbers prove $\overline{zw} = \overline{z}\,\overline{w}$ and then show by induction that $\overline{z_1 \cdots z_m} =
\overline{z_1} \cdots \overline{z_m}$. Also verify that $\overline{\sum_{k=1}^{m} z_k} = \sum_{k=1}^{m} \overline{z_k}$. In words this says the conjugate of a
product equals the product of the conjugates and the conjugate of a sum equals the
sum of the conjugates.

14. Suppose $p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ where all the $a_k$ are real numbers.
Suppose also that $p(z) = 0$ for some $z \in \mathbb{C}$. Show it follows that $p(\overline{z}) = 0$ also.

15. I claim that $1 = -1$. Here is why.

$$-1 = i^2 = \sqrt{-1}\sqrt{-1} = \sqrt{(-1)^2} = \sqrt{1} = 1.$$

This is clearly a remarkable result but is there something wrong with it? If so, what
is wrong?

16. De Moivre's theorem is really a grand thing. I plan to use it now for rational exponents,
not just integers.

$$1 = 1^{(1/4)} = (\cos 2\pi + i\sin 2\pi)^{1/4} = \cos(\pi/2) + i\sin(\pi/2) = i.$$

Therefore, squaring both sides it follows $1 = -1$ as in the previous problem. What
does this tell you about De Moivre's theorem? Is there a profound difference between
raising numbers to integer powers and raising numbers to non integer powers?




17. Show that $\mathbb{C}$ cannot be considered an ordered field. Hint: Consider $i^2 = -1$. Recall
that $1 > 0$ by Proposition 1.4.2.

18. Say $a + ib < x + iy$ if $a < x$ or if $a = x$, then $b < y$. This is called the lexicographic
order. Show that any two different complex numbers can be compared with this order.
What goes wrong in terms of the other requirements for an ordered field?

19. With the order of Problem 18, consider for $n \in \mathbb{N}$ the complex number $1 - \frac{1}{n}$. Show
that with the lexicographic order just described, each of $1 - in$ is an upper bound to
all these numbers. Therefore, this is a set which is "bounded above" but has no least
upper bound with respect to the lexicographic order on $\mathbb{C}$.



1.7 Completeness of $\mathbb{R}$

Recall the following important definition from calculus, completeness of $\mathbb{R}$.

Definition 1.7.1 A non empty set, $S \subseteq \mathbb{R}$ is bounded above (below) if there exists $x \in \mathbb{R}$
such that $x \ge (\le)\ s$ for all $s \in S$. If S is a nonempty set in $\mathbb{R}$ which is bounded above,
then a number, l which has the property that l is an upper bound and that every other upper
bound is no smaller than l is called a least upper bound, l.u.b.(S) or often sup(S). If S is a
nonempty set bounded below, define the greatest lower bound, g.l.b.(S) or inf(S) similarly.
Thus g is the g.l.b.(S) means g is a lower bound for S and it is the largest of all lower
bounds. If S is a nonempty subset of $\mathbb{R}$ which is not bounded above, this information is
expressed by saying $\sup(S) = +\infty$ and if S is not bounded below, $\inf(S) = -\infty$.

Every existence theorem in calculus depends on some form of the completeness axiom. 

Axiom 1.7.2 (completeness) Every nonempty set of real numbers which is bounded above
has a least upper bound and every nonempty set of real numbers which is bounded below has
a greatest lower bound.

It is this axiom which distinguishes Calculus from Algebra. A fundamental result about
sup and inf is the following.

Proposition 1.7.3 Let S be a nonempty set and suppose sup(S) exists. Then for every
$\delta > 0$,

$$S \cap (\sup(S) - \delta, \sup(S)] \ne \emptyset.$$

If inf(S) exists, then for every $\delta > 0$,

$$S \cap [\inf(S), \inf(S) + \delta) \ne \emptyset.$$

Proof: Consider the first claim. If the indicated set equals $\emptyset$, then $\sup(S) - \delta$ is an
upper bound for S which is smaller than sup(S), contrary to the definition of sup(S) as
the least upper bound. In the second claim, if the indicated set equals $\emptyset$, then $\inf(S) + \delta$
would be a lower bound which is larger than inf(S) contrary to the definition of inf(S). ■

1.8 Well Ordering And Archimedean Property

Definition 1.8.1 A set is well ordered if every nonempty subset S, contains a smallest
element z having the property that $z \le x$ for all $x \in S$.

Axiom 1.8.2 Any set of integers larger than a given number is well ordered.

In particular, the natural numbers defined as

$$\mathbb{N} \equiv \{1, 2, \cdots\}$$

is well ordered.

The above axiom implies the principle of mathematical induction.

Theorem 1.8.3 (Mathematical induction) A set $S \subseteq \mathbb{Z}$, having the property that $a \in S$
and $n + 1 \in S$ whenever $n \in S$ contains all integers $x \in \mathbb{Z}$ such that $x \ge a$.

Proof: Let $T \equiv ([a, \infty) \cap \mathbb{Z}) \setminus S$. Thus T consists of all integers larger than or equal
to a which are not in S. The theorem will be proved if $T = \emptyset$. If $T \ne \emptyset$ then by the well
ordering principle, there would have to exist a smallest element of T, denoted as b. It must
be the case that $b > a$ since by definition, $a \notin T$. Then the integer, $b - 1 \ge a$ and $b - 1 \notin S$
because if $b - 1 \in S$, then $b - 1 + 1 = b \in S$ by the assumed property of S. Therefore,
$b - 1 \in ([a, \infty) \cap \mathbb{Z}) \setminus S = T$ which contradicts the choice of b as the smallest element of T.
($b - 1$ is smaller.) Since a contradiction is obtained by assuming $T \ne \emptyset$, it must be the case
that $T = \emptyset$ and this says that everything in $[a, \infty) \cap \mathbb{Z}$ is also in S. ■

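Example 1.8.4 Show that for all $n \in \mathbb{N}$, $\frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} < \frac{1}{\sqrt{2n+1}}$.

If $n = 1$ this reduces to the statement that $\frac{1}{2} < \frac{1}{\sqrt{3}}$ which is obviously true. Suppose
then that the inequality holds for n. Then

$$\frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} \cdot \frac{2n+1}{2n+2} < \frac{1}{\sqrt{2n+1}} \cdot \frac{2n+1}{2n+2} = \frac{\sqrt{2n+1}}{2n+2}.$$

The theorem will be proved if this last expression is less than $\frac{1}{\sqrt{2n+3}}$. This happens if and
only if

$$\left(\frac{1}{\sqrt{2n+3}}\right)^2 = \frac{1}{2n+3} > \frac{2n+1}{(2n+2)^2}$$

which occurs if and only if $(2n+2)^2 > (2n+3)(2n+1)$ and this is clearly true which may
be seen from expanding both sides. This proves the inequality.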
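
An induction proof settles the inequality for every n, but it can be reassuring to see it hold for specific values. The sketch below checks it in exact rational arithmetic by comparing squares, which avoids square roots altogether. The function names are illustrative only.

```python
from fractions import Fraction

def product_odd_over_even(n):
    """Return (1/2)(3/4)...((2n-1)/(2n)) as an exact fraction."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= Fraction(2 * k - 1, 2 * k)
    return result

def inequality_holds(n):
    """Check P < 1/sqrt(2n+1) by comparing squares: P^2 < 1/(2n+1)."""
    p = product_odd_over_even(n)
    return p * p < Fraction(1, 2 * n + 1)

print(all(inequality_holds(n) for n in range(1, 200)))
```

Comparing squares is legitimate here because both sides of the inequality are positive.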

Definition 1.8.5 The Archimedean property states that whenever $x \in \mathbb{R}$, and $a > 0$, there
exists $n \in \mathbb{N}$ such that $na > x$.

Proposition 1.8.6 $\mathbb{R}$ has the Archimedean property.

Proof: Suppose it is not true. Then there exists $x \in \mathbb{R}$ and $a > 0$ such that $na \le x$
for all $n \in \mathbb{N}$. Let $S = \{na : n \in \mathbb{N}\}$. By assumption, this is bounded above by x. By
completeness, it has a least upper bound y. By Proposition 1.7.3 there exists $n \in \mathbb{N}$ such
that

$$y - a < na \le y.$$

Then $y = y - a + a < na + a = (n + 1)a \le y$, a contradiction. ■

Theorem 1.8.7 Suppose $x < y$ and $y - x > 1$. Then there exists an integer $l \in \mathbb{Z}$, such
that $x < l < y$. If x is an integer, there is no integer y satisfying $x < y < x + 1$.

Proof: Let x be the smallest positive integer. Not surprisingly, $x = 1$ but this can be
proved. If $x < 1$ then $x^2 < x$ contradicting the assertion that x is the smallest natural
number. Therefore, 1 is the smallest natural number. This shows there is no integer, y,
satisfying $x < y < x + 1$ since otherwise, you could subtract x and conclude $0 < y - x < 1$
for some integer $y - x$.

Now suppose $y - x > 1$ and let

$$S \equiv \{w \in \mathbb{N} : w \ge y\}.$$

The set S is nonempty by the Archimedean property. Let k be the smallest element of S.
Therefore, $k - 1 < y$. Either $k - 1 \le x$ or $k - 1 > x$. If $k - 1 \le x$, then, since $y - k \le 0$,

$$y - x \le y - (k - 1) = (y - k) + 1 \le 1$$

contrary to the assumption that $y - x > 1$. Therefore, $x < k - 1 < y$. Let $l = k - 1$. ■

It is the next theorem which gives the density of the rational numbers. This means that 
for any real number, there exists a rational number arbitrarily close to it. 

Theorem 1.8.8 If x < y then there exists a rational number r such that x < r < y. 

Proof: Let $n \in \mathbb{N}$ be large enough that

$$n(y - x) > 1.$$

Thus $(y - x)$ added to itself n times is larger than 1. Therefore,

$$n(y - x) = ny + n(-x) = ny - nx > 1.$$

It follows from Theorem 1.8.7 there exists $m \in \mathbb{Z}$ such that

$$nx < m < ny$$

and so take $r = m/n$. ■
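
The proof is constructive enough to turn into a computation. The following sketch picks n with $n(y - x) > 1$ and then an integer m strictly between nx and ny. Two choices here are assumptions of the sketch rather than steps in the text: the name rational_between, and taking m to be the floor of nx plus one, in place of the appeal to Theorem 1.8.7. With floating point inputs the result is subject to rounding; with Fraction inputs the arithmetic is exact.

```python
import math
from fractions import Fraction

def rational_between(x, y):
    """Given x < y, return a Fraction r with x < r < y."""
    n = math.floor(1 / (y - x)) + 1   # then n (y - x) > 1
    m = math.floor(n * x) + 1         # an integer with nx < m <= nx + 1 < ny
    return Fraction(m, n)

r1 = rational_between(Fraction(1, 3), Fraction(1, 2))
r2 = rational_between(math.sqrt(2), 1.415)
print(r1, r2)
```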

Definition 1.8.9 A set $S \subseteq \mathbb{R}$ is dense in $\mathbb{R}$ if whenever $a < b$, $S \cap (a, b) \ne \emptyset$.

Thus the above theorem says $\mathbb{Q}$ is "dense" in $\mathbb{R}$.

Theorem 1.8.10 Suppose $0 < a$ and let $b \ge 0$. Then there exists a unique integer p and
real number r such that $0 \le r < a$ and $b = pa + r$.

Proof: Let $S \equiv \{n \in \mathbb{N} : an > b\}$. By the Archimedean property this set is nonempty.
Let $p + 1$ be the smallest element of S. Then $pa \le b$ because $p + 1$ is the smallest in S.
Therefore,

$$r \equiv b - pa \ge 0.$$

If $r \ge a$ then $b - pa \ge a$ and so $b \ge (p + 1)a$ contradicting $p + 1 \in S$. Therefore, $r < a$ as
desired.

To verify uniqueness of p and r, suppose $p_i$ and $r_i$, $i = 1, 2$, both work and $r_2 > r_1$. Then
a little algebra shows

$$p_1 - p_2 = \frac{r_2 - r_1}{a} \in (0, 1).$$

Thus $p_1 - p_2$ is an integer between 0 and 1, contradicting Theorem 1.8.7. The case that
$r_1 > r_2$ cannot occur either by similar reasoning. Thus $r_1 = r_2$ and it follows that $p_1 = p_2$.
■

This theorem is called the Euclidean algorithm when a and b are integers.
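
As an illustration of the theorem, the quotient p and remainder r can be computed as follows. This is a sketch under stated assumptions: the name divide_with_remainder is hypothetical, and p is obtained as the floor of b/a, which is the same integer the proof singles out as one less than the smallest element of S. Using Fraction inputs keeps the arithmetic exact.

```python
import math
from fractions import Fraction

def divide_with_remainder(b, a):
    """For a > 0 and b >= 0 return (p, r) with b = p*a + r and 0 <= r < a."""
    p = math.floor(Fraction(b) / Fraction(a))   # largest integer with p*a <= b
    r = b - p * a
    return p, r

print(divide_with_remainder(385, 165))
print(divide_with_remainder(Fraction(7, 2), Fraction(3, 4)))
```

For integers this agrees with Python's built-in divmod(b, a).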
1.9 Division And Numbers

First recall Theorem 1.8.10, the Euclidean algorithm.

Theorem 1.9.1 Suppose $0 < a$ and let $b \ge 0$. Then there exists a unique integer p and real
number r such that $0 \le r < a$ and $b = pa + r$.

The following definition describes what is meant by a prime number and also what is 
meant by the word "divides" . 

Definition 1.9.2 The number, a divides the number, b if in Theorem 1.8.10, r = 0. That 
is, there is zero remainder. The notation for this is $a|b$, read a divides b and a is called a
factor of b. A prime number is one which has the property that the only numbers which 
divide it are itself and 1. The greatest common divisor of two positive integers, m, n is that 
number, p which has the property that p divides both m and n and also if q divides both m 
and n, then q divides p. Two integers are relatively prime if their greatest common divisor 
is one. The greatest common divisor of m and n is denoted as (m, n) . 

There is a phenomenal and amazing theorem which relates the greatest common divisor 
to the smallest number in a certain set. Suppose m, n are two positive integers. Then if x, y 
are integers, so is xm + yn. Consider all integers which are of this form. Some are positive 
such as $1m + 1n$ and some are not. The set S in the following theorem consists of exactly
those integers of this form which are positive. Then the greatest common divisor of m and 
n will be the smallest number in S. This is what the following theorem says. 

Theorem 1.9.3 Let m, n be two positive integers and define

$$S \equiv \{xm + yn \in \mathbb{N} : x, y \in \mathbb{Z}\}.$$

Then the smallest number in S is the greatest common divisor, denoted by $(m, n)$.

Proof: First note that both m and n are in S so it is a nonempty set of positive integers.
By well ordering, there is a smallest element of S, called $p = x_0 m + y_0 n$. Either p divides m
or it does not. If p does not divide m, then by Theorem 1.8.10,

$$m = pq + r$$

where $0 < r < p$. Thus $m = (x_0 m + y_0 n)q + r$ and so, solving for r,

$$r = m(1 - x_0 q) + (-y_0 q)n \in S.$$

However, this is a contradiction because p was the smallest element of S. Thus $p|m$. Similarly
$p|n$.

Now suppose q divides both m and n. Then $m = qx$ and $n = qy$ for integers, x and y.
Therefore,

$$p = mx_0 + ny_0 = x_0 qx + y_0 qy = q(x_0 x + y_0 y)$$

showing $q|p$. Therefore, $p = (m, n)$. ■
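
The proof above shows that integers $x_0, y_0$ with $(m, n) = x_0 m + y_0 n$ exist, but it does not say how to find them. One standard way, which is a different technique from the well ordering argument in the text, is the extended Euclidean algorithm. The sketch below is offered only as an illustration; the function name bezout is hypothetical.

```python
def bezout(m, n):
    """Return (p, x0, y0) with p = gcd(m, n) = x0*m + y0*n for positive integers m, n."""
    # Invariant: old_r = old_x*m + old_y*n and r = x*m + y*n at every step.
    old_r, r = m, n
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

p, x0, y0 = bezout(165, 385)
print(p, x0, y0, x0 * 165 + y0 * 385)
```

The pair $x_0, y_0$ is not unique, so another method may return different integers representing the same greatest common divisor.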
There is a relatively simple algorithm for finding $(m, n)$ which will be discussed now.
Suppose $0 < m < n$ where m, n are integers. Also suppose the greatest common divisor is
$(m, n) = d$. Then by the Euclidean algorithm, there exist integers q, r such that

$$n = qm + r, \quad r < m. \qquad (1.1)$$

Now d divides n and m so there are numbers k, l such that $dk = m$, $dl = n$. From the above
equation,

$$r = n - qm = dl - qdk = d(l - qk).$$

Thus d divides both m and r. If k divides both m and r, then from the equation of 1.1 it
follows k also divides n. Therefore, k divides d by the definition of the greatest common
divisor. Thus d is the greatest common divisor of m and r but $m + r < m + n$. This yields
another pair of positive integers for which d is still the greatest common divisor but the
sum of these integers is strictly smaller than the sum of the first two. Now you can do the
same thing to these integers. Eventually the process must end because the sum gets strictly
smaller each time it is done. It ends when there are not two positive integers produced.
That is, one is a multiple of the other. At this point, the greatest common divisor is the
smaller of the two numbers.

Procedure 1.9.4 To find the greatest common divisor of m, n where 0 < m < n, replace
the pair {m, n} with {m, r} where n = qm + r for r < m. This new pair of numbers has
the same greatest common divisor. Do the process to this pair and continue doing this till 
you obtain a pair of numbers where one is a multiple of the other. Then the smaller is the 
sought for greatest common divisor. 
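The procedure is easy to carry out by machine. The following Python sketch is one way to write it. The function name `gcd_by_procedure` and the list of recorded steps are my own additions for illustration.

```python
def gcd_by_procedure(m, n):
    """Greatest common divisor of two positive integers, following Procedure 1.9.4.

    Returns the divisor together with the divisions n = q*m + r which were used.
    """
    m, n = min(m, n), max(m, n)
    steps = []
    while n % m != 0:            # stop when one number is a multiple of the other
        q, r = divmod(n, m)      # n = q*m + r with 0 < r < m
        steps.append((n, q, m, r))
        m, n = r, m              # the new pair {m, r}, smaller number first
    return m, steps

d, steps = gcd_by_procedure(1237, 4322)
for n, q, m, r in steps:
    print(f"{n} = {q} ({m}) + {r}")
print("greatest common divisor:", d)
```

The loop stops as soon as the smaller number divides the larger, which is exactly the stopping rule in the procedure.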

Example 1.9.5 Find the greatest common divisor of 165 and 385. 

Use the Euclidean algorithm to write 

385 = 2 (165) +55 

Thus the next two numbers are 55 and 165. Then 

165 = 3 x 55 

and so the greatest common divisor of the first two numbers is 55. 

Example 1.9.6 Find the greatest common divisor of 1237 and 4322. 

Use the Euclidean algorithm 

4322 = 3 (1237) +611 

Now the two new numbers are 1237,611. Then 

1237 = 2 (611) + 15 

The two new numbers are 15,611. Then 

611 =40 (15) + 11 

The two new numbers are 15,11. Then 

15 = 1 (11) + 4

The two new numbers are 11, 4. Then

11 = 2 (4) + 3

The two new numbers are 4, 3. Then

4 = 1 (3) + 1

The two new numbers are 3, 1. Then

3 = 3 x 1

and so 1 is the greatest common divisor. Of course you could see this right away when the 
two new numbers were 15 and 11. Recall the process delivers numbers which have the same 
greatest common divisor. 

Theorem 1.9.3 will now be used to prove a fundamental property of prime numbers
which leads to the fundamental theorem of arithmetic, the major theorem which says
every integer larger than 1 can be factored as a product of primes.

Theorem 1.9.7 If p is a prime and p|ab then either p|a or p|b.

Proof: Suppose p does not divide a. Then since p is prime, the only factors of p are 1
and p so it follows (p, a) = 1 and therefore, by Theorem 1.9.3, there exist integers x and y
such that

1 = ax + yp.

Multiplying this equation by b yields

b = abx + ybp.

Since p|ab, ab = pz for some integer z. Therefore,

b = abx + ybp = pzx + ybp = p (xz + yb)

and this shows p divides b. ■
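The proof is constructive once you can find the integers x and y. One standard way to find them, not discussed above, is to keep track of the multipliers while running the Euclidean algorithm. The following Python sketch does this and then repeats the computation in the proof with particular numbers. The name `extended_gcd` and the sample values of p, a, b are assumptions made only for illustration.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) = a*x + b*y, for positive integers a, b.

    This is the usual extended Euclidean algorithm, an illustrative helper
    which is not part of the text.
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

p, a, b = 7, 10, 21            # p is prime, p does not divide a, p divides ab = 210
d, x, y = extended_gcd(a, p)   # 1 = a*x + y*p
z = (a * b) // p               # ab = p*z
print(d, a * x + y * p)        # both equal 1
print(b == p * (x * z + y * b))  # b = p (xz + yb), so p divides b
```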

Theorem 1.9.8 (Fundamental theorem of arithmetic) Let $a \in \mathbb{N} \setminus \{1\}$. Then $a = \prod_{i=1}^{n} p_i$
where the $p_i$ are all prime numbers. Furthermore, this prime factorization is unique except for
the order of the factors.

Proof: If a equals a prime number, the prime factorization clearly exists. In particular
the prime factorization exists for the prime number 2. Assume this theorem is true for all
$a \le n - 1$. If n is a prime, then it has a prime factorization. On the other hand, if n is not
a prime, then there exist two integers k and m such that n = km where each of k and m
is less than n. Therefore, each of these is no larger than n - 1 and consequently, each has
a prime factorization. Thus so does n. It remains to argue the prime factorization is unique
except for order of the factors.

Suppose

$$\prod_{i=1}^{n} p_i = \prod_{j=1}^{m} q_j$$

where the $p_i$ and $q_j$ are all prime, there is no way to reorder the $q_k$ such that m = n and
$p_i = q_i$ for all i, and n + m is the smallest positive integer such that this happens. Then
by Theorem 1.9.7, $p_1 | q_j$ for some j. Since these are prime numbers this requires $p_1 = q_j$.
Reordering if necessary it can be assumed that $q_j = q_1$. Then dividing both sides by $p_1 = q_1$,

$$\prod_{i=1}^{n-1} p_{i+1} = \prod_{j=1}^{m-1} q_{j+1}.$$

Since n + m was as small as possible for the theorem to fail, it follows that n - 1 = m - 1
and the prime numbers $q_2, \cdots, q_m$ can be reordered in such a way that $p_k = q_k$ for all
$k = 2, \cdots, n$. Hence $p_i = q_i$ for all i because it was already argued that $p_1 = q_1$, and this
results in a contradiction. ■
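The existence half of the theorem can be watched in action. The following Python sketch finds a prime factorization by repeatedly dividing out the smallest factor. This is plain trial division, chosen only because it is short. The name `prime_factorization` is an illustrative assumption and nothing here is needed for the proof.

```python
def prime_factorization(a):
    """Primes, in nondecreasing order, whose product is the integer a >= 2."""
    factors = []
    p = 2
    while p * p <= a:
        while a % p == 0:    # p is prime here: all smaller primes were divided out
            factors.append(p)
            a //= p
        p += 1
    if a > 1:                # what remains has no factor up to its square root
        factors.append(a)
    return factors

print(prime_factorization(385))   # [5, 7, 11]
print(prime_factorization(360))   # [2, 2, 2, 3, 3, 5]
```

Listing the primes in increasing order is one way to pick a definite representative among the reorderings which the theorem allows.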

1.10 Systems Of Equations 

Sometimes it is necessary to solve systems of equations. For example the problem could be 
to find x and y such that 

x + y = 7 and 2x - y = 8. (1.2) 

The set of ordered pairs, (x,y) which solve both equations is called the solution set. For 
example, you can see that (5,2) = (x,y) is a solution to the above system. To solve this, 
note that the solution set does not change if any equation is replaced by a non zero multiple 
of itself. It also does not change if one equation is replaced by itself added to a multiple 
of the other equation. For example, x and y solve the above system if and only if x and y 
solve the system 



x + y = 7, 2x - y + (-2) (x + y) = 8 + (-2) (7). (1.3)

The second equation was replaced by -2 times the first equation added to the second. Thus
the solution is y = 2, from -3y = -6 and now, knowing y = 2, it follows from the other
equation that x + 2 = 7 and so x = 5.

Why exactly does the replacement of one equation with a multiple of another added to 
it not change the solution set? The two equations of 1.2 are of the form 

$$E_1 = f_1, \quad E_2 = f_2 \tag{1.4}$$

where $E_1$ and $E_2$ are expressions involving the variables. The claim is that if a is a number,
then 1.4 has the same solution set as

$$E_1 = f_1, \quad E_2 + aE_1 = f_2 + af_1. \tag{1.5}$$

Why is this?

If (x, y) solves 1.4 then it solves the first equation in 1.5. Also, it satisfies $aE_1 = af_1$
and so, since it also solves $E_2 = f_2$ it must solve the second equation in 1.5. If (x, y) solves
1.5 then it solves the first equation of 1.4. Also $aE_1 = af_1$ and it is given that the second
equation of 1.5 is verified. Therefore, $E_2 = f_2$ and it follows (x, y) is a solution of the second
equation in 1.4. This shows the solutions to 1.4 and 1.5 are exactly the same which means
they have the same solution set. Of course the same reasoning applies with no change if 
there are many more variables than two and many more equations than two. It is still the 
case that when one equation is replaced with a multiple of another one added to itself, the 
solution set of the whole system does not change. 

The other thing which does not change the solution set of a system of equations consists 
of listing the equations in a different order. Here is another example. 

Example 1.10.1 Find the solutions to the system,

x + 3y + 6z = 25
2x + 7y + 14z = 58 (1.6)
2y + 5z = 19

To solve this system replace the second equation by (-2) times the first equation added
to the second. This yields the system

x + 3y + 6z = 25
y + 2z = 8 (1.7)
2y + 5z = 19

Now take (-2) times the second and add to the third. More precisely, replace the third
equation with (-2) times the second added to the third. This yields the system

x + 3y + 6z = 25
y + 2z = 8 (1.8)
z = 3

At this point, you can tell what the solution is. This system has the same solution as the
original system and in the above, z = 3. Then using this in the second equation, it follows
y + 6 = 8 and so y = 2. Now using this in the top equation yields x + 6 + 18 = 25 and so
x = 1.
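The two replacements just made, followed by the substitution from the bottom up, can be sketched in Python as follows. Each equation is stored as the list of its coefficients followed by its right side. The helper name `add_multiple` is my own, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def add_multiple(rows, source, target, factor):
    """Replace equation `target` by itself plus `factor` times equation `source`."""
    rows[target] = [t + factor * s for s, t in zip(rows[source], rows[target])]

# Coefficients of x, y, z and the right sides for the system in Example 1.10.1.
system = [[Fraction(v) for v in row] for row in
          [[1, 3, 6, 25],
           [2, 7, 14, 58],
           [0, 2, 5, 19]]]

add_multiple(system, 0, 1, -2)   # (-2) times the first added to the second
add_multiple(system, 1, 2, -2)   # (-2) times the second added to the third

# Substitute from the bottom equation upward.
z = system[2][3] / system[2][2]
y = (system[1][3] - system[1][2] * z) / system[1][1]
x = (system[0][3] - system[0][1] * y - system[0][2] * z) / system[0][0]
print(x, y, z)   # 1 2 3
```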

This process is not really much different from what you have always done in solving a
single equation. For example, suppose you wanted to solve 2x + 5 = 3x - 6. You did the
same thing to both sides of the equation thus preserving the solution set until you obtained
an equation which was simple enough to give the answer. In this case, you would add -2x
to both sides and then add 6 to both sides. This yields x = 11.

In 1.8 you could have continued as follows. Add (-2) times the bottom equation to the
middle and then add (-6) times the bottom to the top. This yields

x + 3y = 7
y = 2
z = 3

Now add (-3) times the second to the top. This yields

x = 1
y = 2
z = 3

a system which has the same solution set as the original system.

It is foolish to write the variables every time you do these operations. It is easier to
write the system 1.6 as the following "augmented matrix"

$$\begin{pmatrix} 1 & 3 & 6 & 25 \\ 2 & 7 & 14 & 58 \\ 0 & 2 & 5 & 19 \end{pmatrix}.$$

It has exactly the same information as the original system but here it is understood there is
an x column, $\begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}$, a y column, $\begin{pmatrix} 3 \\ 7 \\ 2 \end{pmatrix}$, and a z column, $\begin{pmatrix} 6 \\ 14 \\ 5 \end{pmatrix}$. The rows correspond
to the equations in the system. Thus the top row in the augmented matrix corresponds to
the equation,

x + 3y + 6z = 25.

Now when you replace an equation with a multiple of another equation added to itself, you
are just taking a row of this augmented matrix and replacing it with a multiple of another
row added to it. Thus the first step in solving 1.6 would be to take (-2) times the first row
of the augmented matrix above and add it to the second row,

$$\begin{pmatrix} 1 & 3 & 6 & 25 \\ 0 & 1 & 2 & 8 \\ 0 & 2 & 5 & 19 \end{pmatrix}.$$

Note how this corresponds to 1.7. Next take (-2) times the second row and add to the
third,

$$\begin{pmatrix} 1 & 3 & 6 & 25 \\ 0 & 1 & 2 & 8 \\ 0 & 0 & 1 & 3 \end{pmatrix}$$

which is the same as 1.8. You get the idea I hope. Write the system as an augmented matrix
and follow the procedure of either switching rows, multiplying a row by a nonzero number,
or replacing a row by a multiple of another row added to it. Each of these operations leaves
the solution set unchanged. These operations are called row operations.

Definition 1.10.2 The row operations consist of the following 

1. Switch two rows. 

2. Multiply a row by a nonzero number. 

3. Replace a row by a multiple of another row added to it. 

It is important to observe that any row operation can be "undone" by another inverse
row operation. For example, if $\mathbf{r}_1, \mathbf{r}_2$ are two rows, and $\mathbf{r}_2$ is replaced with $\mathbf{r}_2' = a\mathbf{r}_1 + \mathbf{r}_2$
using row operation 3, then you could get back to where you started by replacing the row $\mathbf{r}_2'$
with $-a$ times $\mathbf{r}_1$ added to $\mathbf{r}_2'$. In the case of operation 2, you would simply multiply
the row that was changed by the inverse of the scalar which multiplied it in the first place,
and in the case of row operation 1, you would just make the same switch again and you
would be back to where you started. In each case, the row operation which undoes what
was done is called the inverse row operation.
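Here is a short Python sketch of the three row operations acting on a matrix stored as a list of rows, with each operation followed at once by its inverse. The function names `switch`, `scale` and `add_multiple` are illustrative choices of mine. Exact fractions are used so the comparison at the end is not spoiled by rounding.

```python
from fractions import Fraction

def switch(rows, i, j):
    """Row operation 1: switch rows i and j."""
    rows[i], rows[j] = rows[j], rows[i]

def scale(rows, i, c):
    """Row operation 2: multiply row i by the nonzero number c."""
    if c == 0:
        raise ValueError("the multiplier must be nonzero")
    rows[i] = [c * entry for entry in rows[i]]

def add_multiple(rows, i, j, a):
    """Row operation 3: replace row j by a times row i added to row j."""
    rows[j] = [a * u + v for u, v in zip(rows[i], rows[j])]

original = [[Fraction(v) for v in row] for row in
            [[1, 3, 6, 25],
             [2, 7, 14, 58],
             [0, 2, 5, 19]]]
rows = [row[:] for row in original]

switch(rows, 0, 2)
switch(rows, 0, 2)                      # the same switch undoes the first one
scale(rows, 1, Fraction(5))
scale(rows, 1, Fraction(1, 5))          # multiply by the inverse of the scalar
add_multiple(rows, 0, 1, Fraction(-2))
add_multiple(rows, 0, 1, Fraction(2))   # add back the same multiple
print(rows == original)                 # True
```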

Example 1.10.3 Give the complete solution to the system of equations, 5x + 10y - 7z = -2,
2x + 4y - 3z = -1, and 3x + 6y + 5z = 9.

The augmented matrix for this system, with the second equation listed first, is

$$\begin{pmatrix} 2 & 4 & -3 & -1 \\ 5 & 10 & -7 & -2 \\ 3 & 6 & 5 & 9 \end{pmatrix}$$

Multiply the second row by 2, the first row by 5, and then take (-1) times the first row and
add to the second. Then multiply the first row by 1/5. This yields

$$\begin{pmatrix} 2 & 4 & -3 & -1 \\ 0 & 0 & 1 & 1 \\ 3 & 6 & 5 & 9 \end{pmatrix}$$

Now, combining some row operations, take (-3) times the first row and add this to 2 times
the last row and replace the last row with this. This yields

$$\begin{pmatrix} 2 & 4 & -3 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 19 & 21 \end{pmatrix}$$

Putting in the variables, the last two rows say z = 1 and 19z = 21. This is impossible so
the last system of equations determined by the above augmented matrix has no solution.
However, it has the same solution set as the first system of equations. This shows there is no
solution to the three given equations. When this happens, the system is called inconsistent.
It should not be surprising that something like this can take place. It can even happen
for one equation in one variable. Consider for example, x = x + 1. There is clearly no solution
to this.
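An inconsistent system always shows itself in the same way: after enough row operations, some row says that zero equals a nonzero number. The following Python sketch looks for such a row after carrying out row operations systematically. The names `row_reduce` and `is_inconsistent` are my own, and the particular order in which the row operations are done is only one of many possible choices.

```python
from fractions import Fraction

def row_reduce(augmented):
    """Return a row reduced copy of the augmented matrix, using exact fractions."""
    rows = [[Fraction(v) for v in row] for row in augmented]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols - 1):            # the last column holds the right sides
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]   # switch rows
        lead = rows[pivot_row][col]
        rows[pivot_row] = [v / lead for v in rows[pivot_row]]         # scale the row
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows

def is_inconsistent(augmented):
    """True if some reduced row reads 0 = c with c nonzero."""
    return any(all(v == 0 for v in row[:-1]) and row[-1] != 0
               for row in row_reduce(augmented))

example = [[2, 4, -3, -1],
           [5, 10, -7, -2],
           [3, 6, 5, 9]]
print(is_inconsistent(example))   # True
```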

Example 1.10.4 Give the complete solution to the system of equations, 3x - y - 5z = 9,
y - 10z = 0, and -2x + y = -6.

The augmented matrix of this system is

$$\begin{pmatrix} 3 & -1 & -5 & 9 \\ 0 & 1 & -10 & 0 \\ -2 & 1 & 0 & -6 \end{pmatrix}$$

Replace the last row with 2 times the top row added to 3 times the bottom row. This gives

$$\begin{pmatrix} 3 & -1 & -5 & 9 \\ 0 & 1 & -10 & 0 \\ 0 & 1 & -10 & 0 \end{pmatrix}$$

Next take -1 times the middle row and add to the bottom.

$$\begin{pmatrix} 3 & -1 & -5 & 9 \\ 0 & 1 & -10 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

Take the middle row and add to the top and then divide the top row which results by 3.

$$\begin{pmatrix} 1 & 0 & -5 & 3 \\ 0 & 1 & -10 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

This says y = 10z and x = 3 + 5z. Apparently z can equal any number. Therefore, the
solution set of this system is x = 3 + 5t, y = 10t, and z = t where t is completely arbitrary.

The system has an infinite set of solutions and this is a good description of the solutions.
This is what it is all about, finding the solutions to the system.

Definition 1.10.5 Since z = t where t is arbitrary, the variable z is called a free variable. 



The phenomenon of an infinite solution set occurs in equations having only one variable 
also. For example, consider the equation x = x. It doesn't matter what x equals. 
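Returning to Example 1.10.4, it is easy to check by machine that every choice of the free variable gives a solution. The following Python sketch substitutes x = 3 + 5t, y = 10t, z = t into the three equations for a few values of t. The function name `solves_example` and the sample values of t are arbitrary choices of mine.

```python
from fractions import Fraction

def solves_example(x, y, z):
    """Check the three equations of Example 1.10.4."""
    return (3 * x - y - 5 * z == 9) and (y - 10 * z == 0) and (-2 * x + y == -6)

for t in [Fraction(0), Fraction(1), Fraction(-7, 3), Fraction(100)]:
    x, y, z = 3 + 5 * t, 10 * t, t     # the solution with free variable z = t
    print(t, solves_example(x, y, z))
```

A finite list of values of t proves nothing by itself, of course. The point is only to see the description of the solution set doing its job.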

Definition 1.10.6 A system of linear equations is a list of equations,

$$\sum_{j=1}^{n} a_{ij} x_j = f_i, \quad i = 1, 2, 3, \cdots, m$$

where $a_{ij}$ are numbers, $f_i$ is a number, and it is desired to find $(x_1, \cdots, x_n)$ solving each of
the equations listed.

As illustrated above, such a system of linear equations may have a unique solution, no 
solution, or infinitely many solutions. It turns out these are the only three cases which can
occur for linear systems. Furthermore, you do exactly the same things to solve any linear
system. You write the augmented matrix and do row operations until you get a simpler 
system in which it is possible to see the solution. All is based on the observation that the 
row operations do not change the solution set. You can have more equations than variables, 
fewer equations than variables, etc. It doesn't matter. You always set up the augmented 
matrix and go to work on it. These things are all the same. 

Example 1.10.7 Give the complete solution to the system of equations, -41x + 15y = 168,
109x - 40y = -447, -3x + y = 12, and 2x + z = -1.

The augmented matrix is

$$\begin{pmatrix} -41 & 15 & 0 & 168 \\ 109 & -40 & 0 & -447 \\ -3 & 1 & 0 & 12 \\ 2 & 0 & 1 & -1 \end{pmatrix}$$

To solve this multiply the top row by 109, the second row by 41, add the top row to the
second row, and multiply the top row by 1/109. Note how this process combined several
row operations. This yields

$$\begin{pmatrix} -41 & 15 & 0 & 168 \\ 0 & -5 & 0 & -15 \\ -3 & 1 & 0 & 12 \\ 2 & 0 & 1 & -1 \end{pmatrix}$$

Next take 2 times the third row and replace the fourth row by this added to 3 times the
fourth row. Then multiply the third row by (-41) and replace the first row by this added to
3 times the first row. Then switch the third and the first rows. This yields

$$\begin{pmatrix} 123 & -41 & 0 & -492 \\ 0 & -5 & 0 & -15 \\ 0 & 4 & 0 & 12 \\ 0 & 2 & 3 & 21 \end{pmatrix}$$

Take -1/2 times the third row and add to the bottom row. Then take 5 times the third
row and add to four times the second. Finally take 41 times the third row and add to 4
times the top row. This yields

$$\begin{pmatrix} 492 & 0 & 0 & -1476 \\ 0 & 0 & 0 & 0 \\ 0 & 4 & 0 & 12 \\ 0 & 0 & 3 & 15 \end{pmatrix}$$

It follows x = -1476/492 = -3, y = 3 and z = 5.

You should practice solving systems of equations. Here are some exercises. 

1.11 Exercises 

1. Give the complete solution to the system of equations, 3x - y + 4z = 6, y + 8z = 0,
and -2x + y = -4.

2. Give the complete solution to the system of equations, x + 3y + 3z = 3, 3x + 2y + z = 9,
and -4x + z = -9.

3. Consider the system -5x + 2y - z = 0 and -5x - 2y - z = 0. Both equations equal
zero and so -5x + 2y - z = -5x - 2y - z which is equivalent to y = 0. Thus x and
z can equal anything. But when x = 1, z = -4, and y = 0 are plugged in to the
equations, it doesn't work. Why?

4. Give the complete solution to the system of equations, x + 2y + 6z = 5, 3x + 2y + 6z = 7,
-4x + 5y + 15z = -7.

5. Give the complete solution to the system of equations

x + 2y + 3z = 5, 3x + 2y + z = 7,
-4x + 5y + z = -7, x + 3z = 5.

6. Give the complete solution of the system of equations,

x + 2y + 3z = 5, 3x + 2y + 2z = 7
-4x + 5y + 5z = -7, x = 5

7. Give the complete solution of the system of equations

x + y + 3z = 2, 3x - y + 5z = 6
-4x + 9y + z = -8, x + 5y + 7z = 2

8. Determine a such that there are infinitely many solutions and then find them. Next
determine a such that there are no solutions. Finally determine which values of a
correspond to a unique solution. The system of equations for the unknown variables
x, y, z is

$3za^2 - 3a + x + y + 1 = 0$

$3x - a - y + z (a^2 + 4) - 5 = 0$

$za^2 - a - 4x + 9y + 9 = 0$

9. Find the solutions to the following system of equations for x, y, z, w.

y + z = 2, z + w = 0, y - 4z - 5w = 2, 2y + z - w = 4

10. Find all solutions to the following equations.

x + y + z = 2, z + w = 0,
2x + 2y + z - w = 4, x + y - 4z - 5z = 2

1.12 $\mathbb{F}^n$

The notation $\mathbb{C}^n$ refers to the collection of ordered lists of n complex numbers. Since every
real number is also a complex number, this simply generalizes the usual notion of $\mathbb{R}^n$, the
collection of all ordered lists of n real numbers. In order to avoid worrying about whether
it is real or complex numbers which are being referred to, the symbol $\mathbb{F}$ will be used. If it is
not clear, always pick $\mathbb{C}$. More generally, $\mathbb{F}^n$ refers to the ordered lists of n elements of $\mathbb{F}$.

Definition 1.12.1 Define $\mathbb{F}^n \equiv \{(x_1, \cdots, x_n) : x_j \in \mathbb{F} \text{ for } j = 1, \cdots, n\}$. $(x_1, \cdots, x_n) = (y_1, \cdots, y_n)$
if and only if for all $j = 1, \cdots, n$, $x_j = y_j$. When $(x_1, \cdots, x_n) \in \mathbb{F}^n$, it is
conventional to denote $(x_1, \cdots, x_n)$ by the single bold face letter $\mathbf{x}$. The numbers $x_j$ are
called the coordinates. The set

$$\{(0, \cdots, 0, t, 0, \cdots, 0) : t \in \mathbb{F}\}$$

for t in the i th slot is called the i th coordinate axis. The point $\mathbf{0} \equiv (0, \cdots, 0)$ is called the
origin.

Thus $(1, 2, 4i) \in \mathbb{F}^3$ and $(2, 1, 4i) \in \mathbb{F}^3$ but $(1, 2, 4i) \neq (2, 1, 4i)$ because, even though the
same numbers are involved, they don't match up. In particular, the first entries are not
equal.


1.13 Algebra in $\mathbb{F}^n$

There are two algebraic operations done with elements of $\mathbb{F}^n$. One is addition and the other
is multiplication by numbers, called scalars. In the case of $\mathbb{C}^n$ the scalars are complex
numbers while in the case of $\mathbb{R}^n$ the only allowed scalars are real numbers. Thus, the scalars
always come from $\mathbb{F}$ in either case.

Definition 1.13.1 If $\mathbf{x} \in \mathbb{F}^n$ and $a \in \mathbb{F}$, also called a scalar, then $a\mathbf{x} \in \mathbb{F}^n$ is defined by

$$a\mathbf{x} = a (x_1, \cdots, x_n) \equiv (ax_1, \cdots, ax_n). \tag{1.9}$$

This is known as scalar multiplication. If $\mathbf{x}, \mathbf{y} \in \mathbb{F}^n$ then $\mathbf{x} + \mathbf{y} \in \mathbb{F}^n$ and is defined by

$$\mathbf{x} + \mathbf{y} = (x_1, \cdots, x_n) + (y_1, \cdots, y_n) \equiv (x_1 + y_1, \cdots, x_n + y_n) \tag{1.10}$$

With this definition, the algebraic properties satisfy the conclusions of the following
theorem.

Theorem 1.13.2 For $\mathbf{v}, \mathbf{w}, \mathbf{z} \in \mathbb{F}^n$ and $\alpha, \beta$ scalars (elements of $\mathbb{F}$), the following hold.

$$\mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v}, \tag{1.11}$$

the commutative law of addition,

$$(\mathbf{v} + \mathbf{w}) + \mathbf{z} = \mathbf{v} + (\mathbf{w} + \mathbf{z}), \tag{1.12}$$

the associative law for addition,

$$\mathbf{v} + \mathbf{0} = \mathbf{v}, \tag{1.13}$$

the existence of an additive identity,

$$\mathbf{v} + (-\mathbf{v}) = \mathbf{0}, \tag{1.14}$$

the existence of an additive inverse. Also

$$\alpha (\mathbf{v} + \mathbf{w}) = \alpha \mathbf{v} + \alpha \mathbf{w}, \tag{1.15}$$

$$(\alpha + \beta) \mathbf{v} = \alpha \mathbf{v} + \beta \mathbf{v}, \tag{1.16}$$

$$\alpha (\beta \mathbf{v}) = \alpha \beta (\mathbf{v}), \tag{1.17}$$

$$1\mathbf{v} = \mathbf{v}. \tag{1.18}$$

In the above $\mathbf{0} = (0, \cdots, 0)$.

You should verify that these properties all hold. As usual subtraction is defined as
$\mathbf{x} - \mathbf{y} \equiv \mathbf{x} + (-\mathbf{y})$. The conclusions of the above theorem are called the vector space axioms.
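The two operations are easy to try out by machine. In the following Python sketch an element of $\mathbb{F}^n$ is stored as a tuple of Python numbers, with Python's built-in complex numbers standing in for $\mathbb{C}$. The function names `scalar_multiply` and `add` and the sample vectors are my own illustrative choices. Checking a few of the properties on one example is of course not a verification of the theorem.

```python
def scalar_multiply(a, x):
    """The vector a x of 1.9, where x is a tuple of numbers."""
    return tuple(a * xk for xk in x)

def add(x, y):
    """The vector x + y of 1.10; both tuples must have the same length n."""
    if len(x) != len(y):
        raise ValueError("x and y must be in the same F^n")
    return tuple(xk + yk for xk, yk in zip(x, y))

v = (1, 2 + 1j, 0)
w = (3, -1j, 4)
alpha, beta = 2, 1 - 1j

print(add(v, w) == add(w, v))                                          # 1.11
print(scalar_multiply(alpha, add(v, w))
      == add(scalar_multiply(alpha, v), scalar_multiply(alpha, w)))    # 1.15
print(scalar_multiply(alpha + beta, v)
      == add(scalar_multiply(alpha, v), scalar_multiply(beta, v)))     # 1.16
```

The sample entries are small integers so that the floating point arithmetic behind Python's complex numbers is exact here. With arbitrary entries a comparison with a tolerance would be the safer choice.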

1.14 Exercises 

1. Verify all the properties 1.11-1.18. 

2. Compute 5(1,2 + 3i, 3, -2) + 6 (2 - i, 1, -2, 7) . 

3. Draw a picture of the points in $\mathbb{R}^2$ which are determined by the following ordered
pairs.

(a) (1,2)

(b) (-2,-2)

(c) (-2,3)

(d) (2,-5)

4. Does it make sense to write (1, 2) + (2, 3, 1)? Explain.

5. Draw a picture of the points in $\mathbb{R}^3$ which are determined by the following ordered
triples. If you have trouble drawing this, describe it in words.

(a) (1,2,0) 

(b) (-2,-2,1) 

(c) (-2,3,-2) 

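1.15 The Inner Product In $\mathbb{F}^n$

When $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, there is something called an inner product. In case of $\mathbb{R}$ it is also called
the dot product. This is also often referred to as the scalar product.

Definition 1.15.1 Let $\mathbf{a}, \mathbf{b} \in \mathbb{F}^n$ define $\mathbf{a} \cdot \mathbf{b}$ as

$$\mathbf{a} \cdot \mathbf{b} \equiv \sum_{k=1}^{n} a_k \overline{b_k}.$$

With this definition, there are several important properties satisfied by the inner product.
In the statement of these properties, $\alpha$ and $\beta$ will denote scalars and $\mathbf{a}, \mathbf{b}, \mathbf{c}$ will denote
vectors or in other words, points in $\mathbb{F}^n$.

Proposition 1.15.2 The inner product satisfies the following properties.

$$\mathbf{a} \cdot \mathbf{b} = \overline{\mathbf{b} \cdot \mathbf{a}} \tag{1.19}$$

$$\mathbf{a} \cdot \mathbf{a} \ge 0 \text{ and equals zero if and only if } \mathbf{a} = \mathbf{0} \tag{1.20}$$

$$(\alpha \mathbf{a} + \beta \mathbf{b}) \cdot \mathbf{c} = \alpha (\mathbf{a} \cdot \mathbf{c}) + \beta (\mathbf{b} \cdot \mathbf{c}) \tag{1.21}$$

$$\mathbf{c} \cdot (\alpha \mathbf{a} + \beta \mathbf{b}) = \overline{\alpha} (\mathbf{c} \cdot \mathbf{a}) + \overline{\beta} (\mathbf{c} \cdot \mathbf{b}) \tag{1.22}$$

$$|\mathbf{a}|^2 = \mathbf{a} \cdot \mathbf{a} \tag{1.23}$$

You should verify these properties. Also be sure you understand that 1.22 follows from
the first three and is therefore redundant. It is listed here for the sake of convenience.

Example 1.15.3 Find (1, 2, 0, -1) · (0, i, 2, 3).

This equals 0 + 2 (-i) + 0 + (-3) = -3 - 2i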
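The computation in Example 1.15.3 can be repeated in Python, where the imaginary unit is written `1j`. The function name `inner` is an illustrative choice of mine. Note the conjugate on the entries of the second vector, which is what the definition calls for.

```python
def inner(a, b):
    """Inner product of Definition 1.15.1: the sum of a_k times the conjugate of b_k."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same number of entries")
    return sum(ak * complex(bk).conjugate() for ak, bk in zip(a, b))

a = (1, 2, 0, -1)
b = (0, 1j, 2, 3)
print(inner(a, b))                              # (-3-2j)
print(inner(a, b) == inner(b, a).conjugate())   # property 1.19
```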

The Cauchy Schwarz inequality takes the following form in terms of the inner product.
I will prove it using only the above axioms for the inner product.

Theorem 1.15.4 The inner product satisfies the inequality

$$|\mathbf{a} \cdot \mathbf{b}| \le |\mathbf{a}| |\mathbf{b}|. \tag{1.24}$$

Furthermore equality is obtained if and only if one of $\mathbf{a}$ or $\mathbf{b}$ is a scalar multiple of the other.

Proof: First define $\theta \in \mathbb{C}$ such that

$$\overline{\theta} (\mathbf{a} \cdot \mathbf{b}) = |\mathbf{a} \cdot \mathbf{b}|, \quad |\theta| = 1,$$

and define a function of $t \in \mathbb{R}$

$$f (t) = (\mathbf{a} + t\theta \mathbf{b}) \cdot (\mathbf{a} + t\theta \mathbf{b}).$$

Then by 1.20, $f (t) \ge 0$ for all $t \in \mathbb{R}$. Also from 1.21, 1.22, 1.19, and 1.23

$$\begin{aligned}
f (t) &= \mathbf{a} \cdot (\mathbf{a} + t\theta \mathbf{b}) + t\theta \mathbf{b} \cdot (\mathbf{a} + t\theta \mathbf{b}) \\
&= \mathbf{a} \cdot \mathbf{a} + t\overline{\theta} (\mathbf{a} \cdot \mathbf{b}) + t\theta (\mathbf{b} \cdot \mathbf{a}) + t^2 |\theta|^2 \mathbf{b} \cdot \mathbf{b} \\
&= |\mathbf{a}|^2 + 2t \operatorname{Re} \overline{\theta} (\mathbf{a} \cdot \mathbf{b}) + |\mathbf{b}|^2 t^2 = |\mathbf{a}|^2 + 2t |\mathbf{a} \cdot \mathbf{b}| + |\mathbf{b}|^2 t^2
\end{aligned}$$

Now if $|\mathbf{b}| = 0$ it must be the case that $\mathbf{a} \cdot \mathbf{b} = 0$ because otherwise, you could pick large
negative values of t and violate $f (t) \ge 0$. Therefore, in this case, the Cauchy Schwarz
inequality holds. In the case that $|\mathbf{b}| \neq 0$, $y = f (t)$ is a polynomial which opens up and
therefore, if it is always nonnegative, its graph lies on or above the t axis, either missing
the axis or touching it at a single point.

Then the quadratic formula requires that the discriminant satisfy

$$4 |\mathbf{a} \cdot \mathbf{b}|^2 - 4 |\mathbf{a}|^2 |\mathbf{b}|^2 \le 0$$

since otherwise the function $f (t)$ would have two real zeros and would necessarily have a
graph which dips below the t axis. This proves 1.24.

It is clear from the axioms of the inner product that equality holds in 1.24 whenever one
of the vectors is a scalar multiple of the other. It only remains to verify this is the only way
equality can occur. If either vector equals zero, then equality is obtained in 1.24 so it can be
assumed both vectors are nonzero. Then if equality is achieved, it follows $f (t)$ has exactly
one real zero because the discriminant vanishes. Therefore, for some value of t, $\mathbf{a} + t\theta \mathbf{b} = \mathbf{0}$
showing that $\mathbf{a}$ is a multiple of $\mathbf{b}$. ■

You should note that the entire argument was based only on the properties of the inner
product listed in 1.19 - 1.23. This means that whenever something satisfies these properties,
the Cauchy Schwarz inequality holds. There are many other instances of these properties
besides vectors in $\mathbb{F}^n$. Also note that 1.24 holds if 1.20 is simplified to $\mathbf{a} \cdot \mathbf{a} \ge 0$.
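It is instructive to see the inequality, and the case of equality, on particular vectors. The following Python sketch does this. The names `inner` and `norm` and the vectors used are illustrative assumptions of mine. Since square roots are computed in floating point, the case of equality is tested with `math.isclose` rather than with an exact comparison.

```python
import math

def inner(a, b):
    """Sum of a_k times the conjugate of b_k."""
    return sum(ak * complex(bk).conjugate() for ak, bk in zip(a, b))

def norm(a):
    """|a|, using 1.23: |a|^2 = a . a."""
    return math.sqrt(inner(a, a).real)

a = (1, 2, 0, -1)
b = (0, 1j, 2, 3)
print(abs(inner(a, b)), "<=", norm(a) * norm(b))

c = tuple((2 - 1j) * ak for ak in a)     # a scalar multiple of a
print(math.isclose(abs(inner(a, c)), norm(a) * norm(c)))   # True
```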

The Cauchy Schwarz inequality allows a proof of the triangle inequality for distances
in $\mathbb{F}^n$ in much the same way as the triangle inequality for the absolute value.

Theorem 1.15.5 (Triangle inequality) For $\mathbf{a}, \mathbf{b} \in \mathbb{F}^n$

$$|\mathbf{a} + \mathbf{b}| \le |\mathbf{a}| + |\mathbf{b}| \tag{1.25}$$

and equality holds if and only if one of the vectors is a nonnegative scalar multiple of the
other. Also

$$||\mathbf{a}| - |\mathbf{b}|| \le |\mathbf{a} - \mathbf{b}| \tag{1.26}$$

Proof: By properties of the inner product and the Cauchy Schwarz inequality,

$$\begin{aligned}
|\mathbf{a} + \mathbf{b}|^2 &= (\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} + \mathbf{b}) = (\mathbf{a} \cdot \mathbf{a}) + (\mathbf{a} \cdot \mathbf{b}) + (\mathbf{b} \cdot \mathbf{a}) + (\mathbf{b} \cdot \mathbf{b}) \\
&= |\mathbf{a}|^2 + 2 \operatorname{Re} (\mathbf{a} \cdot \mathbf{b}) + |\mathbf{b}|^2 \le |\mathbf{a}|^2 + 2 |\mathbf{a} \cdot \mathbf{b}| + |\mathbf{b}|^2 \\
&\le |\mathbf{a}|^2 + 2 |\mathbf{a}| |\mathbf{b}| + |\mathbf{b}|^2 = (|\mathbf{a}| + |\mathbf{b}|)^2.
\end{aligned}$$

Taking square roots of both sides you obtain 1.25.

It remains to consider when equality occurs. If either vector equals zero, then that
vector equals zero times the other vector and the claim about when equality occurs is
verified. Therefore, it can be assumed both vectors are nonzero. To get equality in the
second inequality above, Theorem 1.15.4 implies one of the vectors must be a multiple of
the other. Say $\mathbf{b} = \alpha \mathbf{a}$. Also, to get equality in the first inequality, $(\mathbf{a} \cdot \mathbf{b})$ must be a
nonnegative real number. Thus

$$0 \le (\mathbf{a} \cdot \mathbf{b}) = (\mathbf{a} \cdot \alpha \mathbf{a}) = \overline{\alpha} |\mathbf{a}|^2.$$

Therefore, $\alpha$ must be a real number which is nonnegative.

To get the other form of the triangle inequality,

$$\mathbf{a} = \mathbf{a} - \mathbf{b} + \mathbf{b}$$

so

$$|\mathbf{a}| = |\mathbf{a} - \mathbf{b} + \mathbf{b}| \le |\mathbf{a} - \mathbf{b}| + |\mathbf{b}|.$$

Therefore,

$$|\mathbf{a}| - |\mathbf{b}| \le |\mathbf{a} - \mathbf{b}| \tag{1.27}$$

Similarly,

$$|\mathbf{b}| - |\mathbf{a}| \le |\mathbf{b} - \mathbf{a}| = |\mathbf{a} - \mathbf{b}|. \tag{1.28}$$

It follows from 1.27 and 1.28 that 1.26 holds. This is because $||\mathbf{a}| - |\mathbf{b}||$ equals the left side
of either 1.27 or 1.28 and either way, $||\mathbf{a}| - |\mathbf{b}|| \le |\mathbf{a} - \mathbf{b}|$. ■
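Both forms of the triangle inequality can be tried on many randomly chosen vectors. The following Python sketch does so in $\mathbb{C}^4$. The helper names, the fixed seed, the number of trials and the small tolerance, which allows for floating point rounding, are all assumptions of mine. A run like this illustrates the theorem but proves nothing.

```python
import math
import random

def norm(a):
    """|a| = sqrt(a . a), which is the square root of the sum of the |a_k|^2."""
    return math.sqrt(sum(abs(ak) ** 2 for ak in a))

random.seed(0)   # fixed seed so that the run is repeatable

def random_vector(n):
    """A vector in C^n with real and imaginary parts chosen in [-1, 1]."""
    return tuple(complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n))

tolerance = 1e-12   # allowance for floating point rounding
ok = True
for _ in range(1000):
    a, b = random_vector(4), random_vector(4)
    total = tuple(x + y for x, y in zip(a, b))
    difference = tuple(x - y for x, y in zip(a, b))
    ok = ok and norm(total) <= norm(a) + norm(b) + tolerance             # 1.25
    ok = ok and abs(norm(a) - norm(b)) <= norm(difference) + tolerance   # 1.26
print(ok)   # True
```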

1.16 What Is Linear Algebra? 

The above preliminary considerations form the necessary scaffolding upon which linear
algebra is built. Linear algebra is the study of a certain algebraic structure called a vector
space described in a special case in Theorem 1.13.2 and in more generality below along with
special functions known as linear transformations. These linear transformations preserve
certain algebraic properties.

A good argument could be made that linear algebra is the most useful subject in all
of mathematics and that it exceeds even courses like calculus in its significance. It is used
extensively in applied mathematics and engineering. Continuum mechanics, for example,
makes use of topics from linear algebra in defining things like the strain and in determining
appropriate constitutive laws. It is fundamental in the study of statistics. For example,
principal component analysis is really based on the singular value decomposition discussed
in this book. It is also fundamental in pure mathematics areas like number theory, functional
analysis, geometric measure theory, and differential geometry. Even calculus cannot be
correctly understood without it. For example, the derivative of a function of many variables
is an example of a linear transformation, and this is the way it must be understood as soon
as you consider functions of more than one variable.



1.17 Exercises 

1. Show that $(\mathbf{a} \cdot \mathbf{b}) = \frac{1}{4} \left[ |\mathbf{a} + \mathbf{b}|^2 - |\mathbf{a} - \mathbf{b}|^2 \right]$.

2. Prove from the axioms of the inner product the parallelogram identity,
$|\mathbf{a} + \mathbf{b}|^2 + |\mathbf{a} - \mathbf{b}|^2 = 2 |\mathbf{a}|^2 + 2 |\mathbf{b}|^2$.

3. For $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$, define $\mathbf{a} \cdot \mathbf{b} \equiv \sum_{k=1}^{n} \beta_k a_k b_k$ where $\beta_k > 0$ for each k. Show this satisfies
the axioms of the inner product. What does the Cauchy Schwarz inequality say in
this case?

4. In Problem 3 above, suppose you only know $\beta_k \ge 0$. Does the Cauchy Schwarz
inequality still hold? If so, prove it.

5. Let f, g be continuous functions and define

$$f \cdot g \equiv \int_0^1 f (t) \overline{g (t)} \, dt$$

show this satisfies the axioms of an inner product if you think of continuous functions
in the place of a vector in $\mathbb{F}^n$. What does the Cauchy Schwarz inequality say in this
case?

6. Show that if f is a real valued continuous function,

$$\left( \int_a^b f (t) \, dt \right)^2 \le (b - a) \int_a^b f (t)^2 \, dt.$$

Linear Transformations



2.1 Matrices 

You have now solved systems of equations by writing them in terms of an augmented matrix 
and then doing row operations on this augmented matrix. It turns out that such rectangular 
arrays of numbers are important from many other different points of view. Numbers are 
also called scalars. In general, scalars are just elements of some field. However, in the first 
part of this book, the field will typically be either the real numbers or the complex numbers. 
A matrix is a rectangular array of numbers. Several of them are referred to as matrices.
For example, here is a matrix.

$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 2 & 8 & 7 \\ 6 & -9 & 1 & 2 \end{pmatrix}$$

This matrix is a 3 × 4 matrix because there are three rows and four columns. The first
row is (1 2 3 4), the second row is (5 2 8 7) and so forth. The first column is $\begin{pmatrix} 1 \\ 5 \\ 6 \end{pmatrix}$. The
convention in dealing with matrices is to always list the rows first and then the columns.
Also, you can remember the columns are like columns in a Greek temple. They stand
upright while the rows just lie there like rows made by a tractor in a plowed field. Elements of
the matrix are identified according to position in the matrix. For example, 8 is in position
2, 3 because it is in the second row and the third column. You might remember that you
always list the rows before the columns by using the phrase Rowman Catholic. The symbol
$(a_{ij})$ refers to a matrix in which the i denotes the row and the j denotes the column. Using
this notation on the above matrix, $a_{23} = 8$, $a_{32} = -9$, $a_{12} = 2$, etc.

There are various operations which are done on matrices. They can sometimes be added,
multiplied by a scalar and sometimes multiplied. To illustrate scalar multiplication, consider
the following example.

$$3 \begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 2 & 8 & 7 \\ 6 & -9 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 3 & 6 & 9 & 12 \\ 15 & 6 & 24 & 21 \\ 18 & -27 & 3 & 6 \end{pmatrix}.$$

The new matrix is obtained by multiplying every entry of the original matrix by the given
scalar. If A is an m × n matrix, -A is defined to equal (-1) A.

Two matrices which are the same size can be added. When this is done, the result is the
matrix which is obtained by adding corresponding entries. Thus

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 2 \end{pmatrix} + \begin{pmatrix} -1 & 4 \\ 2 & 8 \\ 6 & -4 \end{pmatrix} = \begin{pmatrix} 0 & 6 \\ 5 & 12 \\ 11 & -2 \end{pmatrix}.$$

Two matrices are equal exactly when they are the same size and the corresponding entries
are identical. Thus

$$\begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

because they are different sizes. As noted above, you write $(c_{ij})$ for the matrix C whose
ij th entry is $c_{ij}$. In doing arithmetic with matrices you must define what happens in terms
of the $c_{ij}$ sometimes called the entries of the matrix or the components of the matrix.
The above discussion stated for general matrices is given in the following definition.

Definition 2.1.1 Let $A = (a_{ij})$ and $B = (b_{ij})$ be two m × n matrices. Then A + B = C
where

$$C = (c_{ij})$$

for $c_{ij} = a_{ij} + b_{ij}$. Also if x is a scalar,

$$xA = (c_{ij})$$

where $c_{ij} = x a_{ij}$. The number $A_{ij}$ will typically refer to the ij th entry of the matrix A. The
zero matrix, denoted by 0, will be the matrix consisting of all zeros.

Do not be upset by the use of the subscripts, ij. The expression $c_{ij} = a_{ij} + b_{ij}$ is just
saying that you add corresponding entries to get the result of summing two matrices as
discussed above.

Note that there are 2 × 3 zero matrices, 3 × 4 zero matrices, etc. In fact for every size
there is a zero matrix.
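The definition translates directly into a few lines of Python, with a matrix stored as a list of its rows. The function names `add` and `scalar_multiply` are illustrative choices of mine, and the 3 × 4 matrix used is the one with rows (1 2 3 4), (5 2 8 7), (6 -9 1 2).

```python
def add(A, B):
    """Entrywise sum of two matrices of the same size, stored as lists of rows."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must be the same size to be added")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_multiply(x, A):
    """The matrix xA whose ij th entry is x times the ij th entry of A."""
    return [[x * a for a in row] for row in A]

A = [[1, 2, 3, 4],
     [5, 2, 8, 7],
     [6, -9, 1, 2]]
zero = [[0] * 4 for _ in range(3)]    # the 3 x 4 zero matrix

print(scalar_multiply(3, A))
print(add(A, zero) == A)                        # adding the zero matrix changes nothing
print(add(A, scalar_multiply(-1, A)) == zero)   # A + (-A) is the zero matrix
```

Refusing to add matrices of different sizes, rather than padding the smaller one, matches the definition, which only speaks of two m × n matrices.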

With this definition, the following properties are all obvious but you should verify all of
these properties are valid for A, B, and C, m × n matrices and 0 an m × n zero matrix,

$$A + B = B + A, \tag{2.1}$$

the commutative law of addition,

$$(A + B) + C = A + (B + C), \tag{2.2}$$

the associative law for addition,

$$A + 0 = A, \tag{2.3}$$

the existence of an additive identity,

$$A + (-A) = 0, \tag{2.4}$$

the existence of an additive inverse. Also, for $\alpha, \beta$ scalars, the following also hold.

$$\alpha (A + B) = \alpha A + \alpha B, \tag{2.5}$$

$$(\alpha + \beta) A = \alpha A + \beta A, \tag{2.6}$$

$$\alpha (\beta A) = \alpha \beta (A), \tag{2.7}$$

$$1A = A. \tag{2.8}$$

The above properties, 2.1 - 2.8 are known as the vector space axioms and the fact that
the m × n matrices satisfy these axioms is what is meant by saying this set of matrices with
addition and scalar multiplication as defined above forms a vector space.

Definition 2.1.2 Matrices which are n × 1 or 1 × n are especially called vectors and are
often denoted by a bold letter. Thus

$$\mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

is an n × 1 matrix also called a column vector while a 1 × n matrix of the form $(x_1 \cdots x_n)$
is referred to as a row vector.

All the above is fine, but the real reason for considering matrices is that they can be
multiplied. This is where things quit being banal.

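First consider the problem of multiplying an $m \times n$ matrix by an $n \times 1$ column vector. Consider the following example

$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \begin{pmatrix} 7 \\ 8 \\ 9 \end{pmatrix} = \ ?$$

It equals

$$7 \begin{pmatrix} 1 \\ 4 \end{pmatrix} + 8 \begin{pmatrix} 2 \\ 5 \end{pmatrix} + 9 \begin{pmatrix} 3 \\ 6 \end{pmatrix}.$$

Thus it is what is called a linear combination of the columns. These will be discussed more later. Motivated by this example, here is the definition of how to multiply an $m \times n$ matrix by an $n \times 1$ matrix (vector).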

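Definition 2.1.3 Let $A = (A_{ij})$ be an $m \times n$ matrix and let $\mathbf{v}$ be an $n \times 1$ matrix,

$$\mathbf{v} = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}, \quad A = (\mathbf{a}_1, \cdots, \mathbf{a}_n)$$

where $\mathbf{a}_i$ is an $m \times 1$ vector. Then $A\mathbf{v}$, written as

$$(\mathbf{a}_1 \ \cdots \ \mathbf{a}_n) \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix},$$

is the $m \times 1$ column vector which equals the following linear combination of the columns.

$$v_1 \mathbf{a}_1 + v_2 \mathbf{a}_2 + \cdots + v_n \mathbf{a}_n \equiv \sum_{j=1}^{n} v_j \mathbf{a}_j \tag{2.9}$$

If the $j^{th}$ column of $A$ is

$$\begin{pmatrix} A_{1j} \\ A_{2j} \\ \vdots \\ A_{mj} \end{pmatrix}$$

then 2.9 takes the form

$$v_1 \begin{pmatrix} A_{11} \\ A_{21} \\ \vdots \\ A_{m1} \end{pmatrix} + v_2 \begin{pmatrix} A_{12} \\ A_{22} \\ \vdots \\ A_{m2} \end{pmatrix} + \cdots + v_n \begin{pmatrix} A_{1n} \\ A_{2n} \\ \vdots \\ A_{mn} \end{pmatrix}$$

Thus the $i^{th}$ entry of $A\mathbf{v}$ is $\sum_{j=1}^{n} A_{ij} v_j$. Note that multiplication by an $m \times n$ matrix takes an $n \times 1$ matrix, and produces an $m \times 1$ matrix (vector).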

Here is another example. 
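Example 2.1.4 Compute

$$\begin{pmatrix} 1 & 2 & 1 & 3 \\ 0 & 2 & 1 & -2 \\ 2 & 1 & 4 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 0 \\ 1 \end{pmatrix}.$$

First of all, this is of the form $(3 \times 4)(4 \times 1)$ and so the result should be a $(3 \times 1)$. Note how the inside numbers cancel. To get the entry in the second row and first and only column, compute

$$\sum_{k=1}^{4} a_{2k} v_k = a_{21} v_1 + a_{22} v_2 + a_{23} v_3 + a_{24} v_4 = 0 \times 1 + 2 \times 2 + 1 \times 0 + (-2) \times 1 = 2.$$

You should do the rest of the problem and verify

$$\begin{pmatrix} 1 & 2 & 1 & 3 \\ 0 & 2 & 1 & -2 \\ 2 & 1 & 4 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 8 \\ 2 \\ 5 \end{pmatrix}.$$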
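The two descriptions of $A\mathbf{v}$ given so far, as a linear combination of the columns and entry by entry, can be compared in a short Python sketch. The function names `mat_vec_columns` and `mat_vec_entries` are hypothetical, and storing a matrix as a list of rows is an assumption of the sketch; the data is the $3 \times 4$ matrix and the vector $(1, 2, 0, 1)^T$ of Example 2.1.4.

```python
def mat_vec_columns(A, v):
    """Compute Av as v_1 a_1 + ... + v_n a_n, a combination of the columns of A."""
    m, n = len(A), len(A[0])
    if len(v) != n:
        raise ValueError("A must have as many columns as v has entries")
    result = [0] * m
    for j in range(n):              # add v_j times the j-th column of A
        for i in range(m):
            result[i] += v[j] * A[i][j]
    return result


def mat_vec_entries(A, v):
    """Compute Av entry by entry: the i-th entry is sum_j A_ij v_j."""
    if any(len(row) != len(v) for row in A):
        raise ValueError("A must have as many columns as v has entries")
    return [sum(a * x for a, x in zip(row, v)) for row in A]


A = [[1, 2, 1, 3],
     [0, 2, 1, -2],
     [2, 1, 4, 1]]
v = [1, 2, 0, 1]

print(mat_vec_columns(A, v))   # [8, 2, 5]
print(mat_vec_entries(A, v))   # [8, 2, 5]
```

Both functions return the same $3 \times 1$ result, which is the point of the remark that the $i^{th}$ entry of $A\mathbf{v}$ is $\sum_j A_{ij} v_j$.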

With this done, the next task is to multiply an $m \times n$ matrix times an $n \times p$ matrix. Before doing so, the following may be helpful.

$$(m \times n)(n \times p) = m \times p$$

If the two middle numbers don't match, you can't multiply the matrices!

Definition 2.1.5 Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times p$ matrix. Then $B$ is of the form

$$B = (\mathbf{b}_1, \cdots, \mathbf{b}_p)$$

where $\mathbf{b}_k$ is an $n \times 1$ matrix. Then an $m \times p$ matrix $AB$ is defined as follows.

$$AB = (A\mathbf{b}_1, \cdots, A\mathbf{b}_p) \tag{2.10}$$

where $A\mathbf{b}_k$ is an $m \times 1$ matrix. Hence $AB$ as just defined is an $m \times p$ matrix. For example,
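Example 2.1.6 Multiply the following.

$$\begin{pmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{pmatrix}$$

The first thing you need to check before doing anything else is whether it is possible to do the multiplication. The first matrix is a $2 \times 3$ and the second matrix is a $3 \times 3$. Therefore, it is possible to multiply these matrices. According to the above discussion it should be a $2 \times 3$ matrix of the form

$$\left( \underbrace{\begin{pmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ -2 \end{pmatrix}}_{\text{First column}}, \ \underbrace{\begin{pmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}}_{\text{Second column}}, \ \underbrace{\begin{pmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}}_{\text{Third column}} \right)$$

You know how to multiply a matrix times a vector and so you do so to obtain each of the three columns. Thus

$$\begin{pmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 9 & 3 \\ -2 & 7 & 3 \end{pmatrix}.$$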

Here is another example. 
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Example 2.1.7 Multiply the following. 



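$$\begin{pmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{pmatrix}$$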



First check if it is possible. This is of the form (3 x 3) (2 x 3) . The inside numbers do not 
match and so you can't do this multiplication. This means that anything you write will be 
absolute nonsense because it is impossible to multiply these matrices in this order. Aren't 
they the same two matrices considered in the previous example? Yes they are. It is just 
that here they are in a different order. This shows something you must always remember 
about matrix multiplication. 



Order Matters!

Matrix multiplication is not commutative. This is very different than multiplication of numbers!



2.1.1 The ij th Entry Of A Product 

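It is important to describe matrix multiplication in terms of entries of the matrices. What is the $ij^{th}$ entry of $AB$? It would be the $i^{th}$ entry of the $j^{th}$ column of $AB$. Thus it would be the $i^{th}$ entry of $A\mathbf{b}_j$. Now

$$\mathbf{b}_j = \begin{pmatrix} B_{1j} \\ \vdots \\ B_{nj} \end{pmatrix}$$

and from the above definition, the $i^{th}$ entry is

$$\sum_{k=1}^{n} A_{ik} B_{kj}. \tag{2.11}$$

In terms of pictures of the matrix, you are doing

$$\begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} & \cdots & B_{1p} \\ B_{21} & B_{22} & \cdots & B_{2p} \\ \vdots & \vdots & & \vdots \\ B_{n1} & B_{n2} & \cdots & B_{np} \end{pmatrix}$$

Then as explained above, the $j^{th}$ column is of the form

$$\begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix} \begin{pmatrix} B_{1j} \\ B_{2j} \\ \vdots \\ B_{nj} \end{pmatrix}$$

which is an $m \times 1$ matrix or column vector which equals

$$\begin{pmatrix} A_{11} \\ A_{21} \\ \vdots \\ A_{m1} \end{pmatrix} B_{1j} + \begin{pmatrix} A_{12} \\ A_{22} \\ \vdots \\ A_{m2} \end{pmatrix} B_{2j} + \cdots + \begin{pmatrix} A_{1n} \\ A_{2n} \\ \vdots \\ A_{mn} \end{pmatrix} B_{nj}.$$

The $i^{th}$ entry of this $m \times 1$ matrix is

$$A_{i1} B_{1j} + A_{i2} B_{2j} + \cdots + A_{in} B_{nj} = \sum_{k=1}^{n} A_{ik} B_{kj}.$$

This shows the following definition for matrix multiplication in terms of the $ij^{th}$ entries of the product harmonizes with Definition 2.1.3.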

This motivates the definition for matrix multiplication which identifies the ij th entries 
of the product. 



Definition 2.1.8 Let $A = (A_{ij})$ be an $m \times n$ matrix and let $B = (B_{ij})$ be an $n \times p$ matrix. Then $AB$ is an $m \times p$ matrix and

$$(AB)_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj} \tag{2.12}$$

Two matrices, $A$ and $B$ are said to be conformable in a particular order if they can be multiplied in that order. Thus if $A$ is an $r \times s$ matrix and $B$ is a $s \times p$ then $A$ and $B$ are conformable in the order $AB$. The above formula for $(AB)_{ij}$ says that it equals the $i^{th}$ row of $A$ times the $j^{th}$ column of $B$.
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Example 2.1.9 Multiply if possible 

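$$\begin{pmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 2 & 3 & 1 \\ 7 & 6 & 2 \end{pmatrix}$$

First check to see if this is possible. It is of the form $(3 \times 2)(2 \times 3)$ and since the inside numbers match, it must be possible to do this and the result should be a $3 \times 3$ matrix. The answer is of the form

$$\left( \begin{pmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 2 \\ 7 \end{pmatrix}, \ \begin{pmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 3 \\ 6 \end{pmatrix}, \ \begin{pmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right)$$

where the commas separate the columns in the resulting product. Thus the above product equals

$$\begin{pmatrix} 16 & 15 & 5 \\ 13 & 15 & 5 \\ 46 & 42 & 14 \end{pmatrix},$$

a $3 \times 3$ matrix as desired. In terms of the $ij^{th}$ entries and the above definition, the entry in the third row and second column of the product should equal

$$\sum_{k} a_{3k} b_{k2} = a_{31} b_{12} + a_{32} b_{22} = 2 \times 3 + 6 \times 6 = 42.$$

You should try a few more such examples to verify the above definition in terms of the $ij^{th}$ entries works for other entries.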

Example 2.1.10 Multiply if possible $\begin{pmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 2 & 3 & 1 \\ 7 & 6 & 2 \\ 0 & 0 & 0 \end{pmatrix}$.

This is not possible because it is of the form (3 x 2) (3 x 3) and the middle numbers 
don't match. 

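Example 2.1.11 Multiply if possible $\begin{pmatrix} 2 & 3 & 1 \\ 7 & 6 & 2 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{pmatrix}$.

This is possible because in this case it is of the form $(3 \times 3)(3 \times 2)$ and the middle numbers do match. When the multiplication is done it equals

$$\begin{pmatrix} 13 & 13 \\ 29 & 32 \\ 0 & 0 \end{pmatrix}.$$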

Check this and be sure you come up with the same answer. 

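Example 2.1.12 Multiply if possible $\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 1 & 0 \end{pmatrix}$.

In this case you are trying to do $(3 \times 1)(1 \times 4)$. The inside numbers match so you can do it. Verify

$$\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 1 & 0 \\ 2 & 4 & 2 & 0 \\ 1 & 2 & 1 & 0 \end{pmatrix}$$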
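The entry formula 2.12 together with the conformability check can be sketched in Python as follows. The function name `mat_mul` and the list-of-rows representation are assumptions of this sketch, not notation from the text. The data reproduces the products worked out in the examples above, including the case where the middle numbers do not match.

```python
def mat_mul(A, B):
    """Return AB using (AB)_ij = sum_k A_ik B_kj; raise if not conformable."""
    n = len(A[0])
    if n != len(B):
        raise ValueError(
            f"cannot multiply ({len(A)} x {n}) by ({len(B)} x {len(B[0])})"
        )
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


A = [[1, 2], [3, 1], [2, 6]]           # 3 x 2
B = [[2, 3, 1], [7, 6, 2]]             # 2 x 3
B0 = B + [[0, 0, 0]]                   # 3 x 3, B with a row of zeros appended

print(mat_mul(A, B))    # [[16, 15, 5], [13, 15, 5], [46, 42, 14]]
print(mat_mul(B0, A))   # [[13, 13], [29, 32], [0, 0]]

try:
    mat_mul(A, B0)      # (3 x 2)(3 x 3): the middle numbers do not match
except ValueError as err:
    print("not conformable:", err)

# A column times a row is allowed: (3 x 1)(1 x 4) gives a 3 x 4 matrix.
print(mat_mul([[1], [2], [1]], [[1, 2, 1, 0]]))
# [[1, 2, 1, 0], [2, 4, 2, 0], [1, 2, 1, 0]]
```

Raising an error for non-conformable sizes is a design choice of the sketch; it mirrors the warning that anything you write in that case is nonsense.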

2.1.2 Digraphs 

Consider the following graph illustrated in the picture. 




There are three locations in this graph, labelled 1, 2, and 3. The directed lines represent a way of going from one location to another. Thus there is one way to go from location 1 to location 1. There is one way to go from location 1 to location 3. It is not possible to go from location 2 to location 3 although it is possible to go from location 3 to location 2. Let's refer to moving along one of these directed lines as a step. The following $3 \times 3$ matrix is a numerical way of writing the above graph. This is sometimes called a digraph, short for directed graph.

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \end{pmatrix}$$

Thus $a_{ij}$, the entry in the $i^{th}$ row and $j^{th}$ column represents the number of ways to go from location $i$ to location $j$ in one step.

Problem: Find the number of ways to go from i to j using exactly k steps. 

Denote the answer to the above problem by $a^{k}_{ij}$. We don't know what it is right now unless $k = 1$ when it equals $a_{ij}$ described above. However, if we did know what it was, we could find $a^{k+1}_{ij}$ as follows.

$$a^{k+1}_{ij} = \sum_{r} a^{k}_{ir} a_{rj}$$

This is because if you go from $i$ to $j$ in $k + 1$ steps, you first go from $i$ to $r$ in $k$ steps and then for each of these ways there are $a_{rj}$ ways to go from there to $j$. Thus $a^{k}_{ir} a_{rj}$ gives the number of ways to go from $i$ to $j$ in $k + 1$ steps such that the $k^{th}$ step leaves you at location $r$. Adding these gives the above sum. Now you recognize this as the $ij^{th}$ entry of the product of two matrices. Thus

$$a^{2}_{ij} = \sum_{r} a_{ir} a_{rj}, \quad a^{3}_{ij} = \sum_{r} a^{2}_{ir} a_{rj}$$

and so forth. From the above definition of matrix multiplication, this shows that if $A$ is the matrix associated with the directed graph as above, then $a^{k}_{ij}$ is just the $ij^{th}$ entry of $A^k$ where $A^k$ is just what you would think it should be, $A$ multiplied by itself $k$ times.

Thus in the above example, to find the number of ways of going from 1 to 3 in two steps you would take that matrix and multiply it by itself and then take the entry in the first row and third column. Thus

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \end{pmatrix}^2 = \begin{pmatrix} 3 & 2 & 1 \\ 1 & 1 & 1 \\ 2 & 1 & 1 \end{pmatrix}$$

and you see there is exactly one way to go from 1 to 3 in two steps. You can easily see this is true from looking at the graph also. Note there are three ways to go from 1 to 1 in 2 steps. Can you find them from the graph? What would you do if you wanted to consider 5 steps?

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \end{pmatrix}^5 = \begin{pmatrix} 28 & 19 & 13 \\ 13 & 9 & 6 \\ 19 & 13 & 9 \end{pmatrix}$$

There are 19 ways to go from 1 to 2 in five steps. Do you think you could list them all by looking at the graph? I don't think you could do it without wasting a lot of time.
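The counting argument can be checked numerically. The sketch below, with hypothetical helper names `mat_mul`, `mat_pow` and `count_walks`, computes powers of the $3 \times 3$ matrix above and compares them with a direct recursive count of the step sequences. Note that `count_walks` peels off the first step rather than the last one used in the text; both recursions count the same thing.

```python
def mat_mul(A, B):
    """Product of conformable matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_pow(A, k):
    """A multiplied by itself k times, for k >= 1."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result


def count_walks(A, i, j, k):
    """Count the ways to go from i to j in exactly k steps by direct recursion."""
    if k == 0:
        return 1 if i == j else 0
    # Take one step from i to r, then go from r to j in k - 1 steps.
    return sum(A[i][r] * count_walks(A, r, j, k - 1) for r in range(len(A)))


# Locations 1, 2, 3 of the text correspond to indices 0, 1, 2 here.
A = [[1, 1, 1],
     [1, 0, 0],
     [1, 1, 0]]

A2 = mat_pow(A, 2)
A5 = mat_pow(A, 5)
print(A2[0][2])   # 1   (from 1 to 3 in two steps)
print(A5[0][1])   # 19  (from 1 to 2 in five steps)

# The matrix power and the direct count agree for every pair of locations.
assert all(A5[i][j] == count_walks(A, i, j, 5)
           for i in range(3) for j in range(3))
```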

Of course there is nothing sacred about having only three locations. Everything works 
just as well with any number of locations. In general if you have n locations, you would 
need to use a n x n matrix. 

Example 2.1.13 Consider the following directed graph. 
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Write the matrix which is associated with this directed graph and find the number of ways 
to go from 2 to 4 in three steps. 

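Here you need to use a $4 \times 4$ matrix. The one you need is

$$\begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 \end{pmatrix}$$

Then to find the answer, you just need to multiply this matrix by itself three times and look at the entry in the second row and fourth column.

$$\begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 \end{pmatrix}^3 = \begin{pmatrix} 1 & 3 & 2 & 1 \\ 2 & 1 & 0 & 1 \\ 3 & 3 & 1 & 2 \\ 1 & 2 & 1 & 1 \end{pmatrix}$$

There is exactly one way to go from 2 to 4 in three steps.

How many ways would there be of going from 2 to 4 in five steps?

$$\begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 \end{pmatrix}^5 = \begin{pmatrix} 5 & 9 & 5 & 4 \\ 5 & 4 & 1 & 3 \\ 9 & 10 & 4 & 6 \\ 4 & 6 & 3 & 3 \end{pmatrix}$$

There are three ways. Note there are 10 ways to go from 3 to 2 in five steps.

This is an interesting application of the concept of the $ij^{th}$ entry of a product of matrices.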
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2.1.3 Properties Of Matrix Multiplication 

As pointed out above, sometimes it is possible to multiply matrices in one order but not 
in the other order. What if it makes sense to multiply them in either order? Will they be 
equal then? 

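Example 2.1.14 Compare $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$.

The first product is

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix},$$

the second product is

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 3 & 4 \\ 1 & 2 \end{pmatrix},$$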

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and you see these are not equal. Therefore, you cannot conclude that AB = BA for matrix 
multiplication. However, there are some properties which do hold. 

Proposition 2.1.15 If all multiplications and additions make sense, the following hold for matrices, $A, B, C$ and $a, b$ scalars.

$$A(aB + bC) = a(AB) + b(AC) \tag{2.13}$$

$$(B + C)A = BA + CA \tag{2.14}$$

$$A(BC) = (AB)C \tag{2.15}$$

Proof: Using the above definition of matrix multiplication,

$$\begin{aligned}
(A(aB + bC))_{ij} &= \sum_{k} A_{ik} (aB + bC)_{kj} \\
&= \sum_{k} A_{ik} (aB_{kj} + bC_{kj}) \\
&= a \sum_{k} A_{ik} B_{kj} + b \sum_{k} A_{ik} C_{kj} \\
&= a(AB)_{ij} + b(AC)_{ij} \\
&= (a(AB) + b(AC))_{ij}
\end{aligned}$$

showing that $A(aB + bC) = a(AB) + b(AC)$ as claimed. Formula 2.14 is entirely similar.

Consider 2.15, the associative law of multiplication. Before reading this, review the definition of matrix multiplication in terms of entries of the matrices.

$$\begin{aligned}
(A(BC))_{ij} &= \sum_{k} A_{ik} (BC)_{kj} \\
&= \sum_{k} A_{ik} \sum_{l} B_{kl} C_{lj} \\
&= \sum_{l} (AB)_{il} C_{lj} \\
&= ((AB)C)_{ij}. \ \blacksquare
\end{aligned}$$

Another important operation on matrices is that of taking the transpose. The following 
example shows what is meant by this operation, denoted by placing a T as an exponent on 
the matrix. 



(•":"')'■(..- :) 



What happened? The first column became the first row and the second column became 
the second row. Thus the 3x2 matrix became a 2 x 3 matrix. The number 3 was in the 
second row and the first column and it ended up in the first row and second column. This 
motivates the following definition of the transpose of a matrix. 

Definition 2.1.16 Let $A$ be an $m \times n$ matrix. Then $A^T$ denotes the $n \times m$ matrix which is defined as follows.

$$(A^T)_{ij} = A_{ji}$$

The transpose of a matrix has the following important property. 
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Lemma 2.1.17 Let A be an m x n matrix and let B be a n x p matrix. Then 

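$$(AB)^T = B^T A^T \tag{2.16}$$

and if $\alpha$ and $\beta$ are scalars,

$$(\alpha A + \beta B)^T = \alpha A^T + \beta B^T \tag{2.17}$$

Proof: From the definition,

$$\begin{aligned}
((AB)^T)_{ij} &= (AB)_{ji} \\
&= \sum_{k} A_{jk} B_{ki} \\
&= \sum_{k} (B^T)_{ik} (A^T)_{kj} \\
&= (B^T A^T)_{ij}
\end{aligned}$$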

2.17 is left as an exercise. ■ 
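The identities 2.13 - 2.17 can be spot-checked on small integer matrices. The Python sketch below does this; the helper names `mat_mul`, `transpose` and `lin_comb` and all of the sample matrices are hypothetical choices made for illustration, and passing assertions on one example is of course not a substitute for the proofs.

```python
def mat_mul(A, B):
    """Product of conformable matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Rows of A become columns: (A^T)_ij = A_ji."""
    return [list(col) for col in zip(*A)]


def lin_comb(a, A, b, B):
    """The matrix aA + bB for scalars a, b and same-size matrices A, B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2, 0], [-1, 3, 4]]        # 2 x 3
B = [[2, 1], [0, -1], [5, 3]]      # 3 x 2
C = [[1, 0], [4, 2], [-2, 1]]      # 3 x 2
D = [[1, -1], [2, 3]]              # 2 x 2

# (2.13) A(aB + bC) = a(AB) + b(AC) with a = 2, b = 3
assert mat_mul(A, lin_comb(2, B, 3, C)) == lin_comb(2, mat_mul(A, B),
                                                    3, mat_mul(A, C))
# (2.14) (B + C)A = BA + CA
assert mat_mul(lin_comb(1, B, 1, C), A) == lin_comb(1, mat_mul(B, A),
                                                    1, mat_mul(C, A))
# (2.15) A(BD) = (AB)D
assert mat_mul(A, mat_mul(B, D)) == mat_mul(mat_mul(A, B), D)
# (2.16) (AB)^T = B^T A^T
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
# (2.17) (aB + bC)^T = a B^T + b C^T
assert transpose(lin_comb(2, B, 3, C)) == lin_comb(2, transpose(B),
                                                   3, transpose(C))
```

Notice that in the check of 2.16 the order of the factors reverses; with these sizes $A^T B^T$ would be a $3 \times 3$ matrix, which could not equal the $2 \times 2$ matrix $(AB)^T$.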

Definition 2.1.18 An $n \times n$ matrix $A$ is said to be symmetric if $A = A^T$. It is said to be skew symmetric if $A^T = -A$.

Example 2.1.19 Let 

$$A = \begin{pmatrix} 2 & 1 & 3 \\ 1 & 5 & -3 \\ 3 & -3 & 7 \end{pmatrix}$$

Then A is symmetric. 

Example 2.1.20 Let 

$$A = \begin{pmatrix} 0 & 1 & 3 \\ -1 & 0 & 2 \\ -3 & -2 & 0 \end{pmatrix}$$

Then A is skew symmetric. 

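There is a special matrix called $I$ and defined by

$$I_{ij} = \delta_{ij}$$

where $\delta_{ij}$ is the Kronecker symbol defined by

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$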

It is called the identity matrix because it is a multiplicative identity in the following sense. 

Lemma 2.1.21 Suppose $A$ is an $m \times n$ matrix and $I_n$ is the $n \times n$ identity matrix. Then $AI_n = A$. If $I_m$ is the $m \times m$ identity matrix, it also follows that $I_m A = A$.

Proof:

$$(AI_n)_{ij} = \sum_{k} A_{ik} \delta_{kj} = A_{ij}$$

and so $AI_n = A$. The other case is left as an exercise. ■
Definition 2.1.22 An $n \times n$ matrix $A$ has an inverse $A^{-1}$ if and only if there exists a matrix, denoted as $A^{-1}$ such that $AA^{-1} = A^{-1}A = I$ where $I = (\delta_{ij})$ for

$$\delta_{ij} \equiv \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

Such a matrix is called invertible. 

If it acts like an inverse, then it is the inverse. This is the message of the following 
proposition. 

Proposition 2.1.23 Suppose $AB = BA = I$. Then $B = A^{-1}$.

Proof: From the definition $B$ is an inverse for $A$. Could there be another one $B'$?

$$B' = B'I = B'(AB) = (B'A)B = IB = B.$$

Thus, the inverse, if it exists, is unique. ■ 




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2.1.4 Finding The Inverse Of A Matrix 

A little later a formula is given for the inverse of a matrix. However, it is not a good way 
to find the inverse for a matrix. There is a much easier way and it is this which is presented 
here. It is also important to note that not all matrices have inverses. 

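Example 2.1.24 Let $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$. Does $A$ have an inverse?

One might think $A$ would have an inverse because it does not equal zero. However,

$$\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

and if $A^{-1}$ existed, this could not happen because you could multiply on the left by the inverse $A^{-1}$ and conclude the vector $(-1, 1)^T = (0, 0)^T$. Thus the answer is that $A$ does not have an inverse.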

Suppose you want to find $B$ such that $AB = I$. Let

$$B = (\mathbf{b}_1 \ \cdots \ \mathbf{b}_n)$$

Also the $i^{th}$ column of $I$ is

$$\mathbf{e}_i = (0 \ \cdots \ 0 \ 1 \ 0 \ \cdots \ 0)^T$$

Thus, if $AB = I$, $\mathbf{b}_i$, the $i^{th}$ column of $B$ must satisfy the equation $A\mathbf{b}_i = \mathbf{e}_i$. The augmented matrix for finding $\mathbf{b}_i$ is $(A|\mathbf{e}_i)$. Thus, by doing row operations till $A$ becomes $I$, you end up with $(I|\mathbf{b}_i)$ where $\mathbf{b}_i$ is the solution to $A\mathbf{b}_i = \mathbf{e}_i$. Now the same sequence of row operations works regardless of the right side of the augmented matrix $(A|\mathbf{e}_i)$ and so you can save trouble by simply doing the following.

$$(A|I) \xrightarrow{\text{row operations}} (I|B)$$

and the $i^{th}$ column of $B$ is $\mathbf{b}_i$, the solution to $A\mathbf{b}_i = \mathbf{e}_i$. Thus $AB = I$.

This is the reason for the following simple procedure for finding the inverse of a matrix. 
This procedure is called the Gauss Jordan procedure. It produces the inverse if the matrix 
has one. Actually, it produces the right inverse. 

Procedure 2.1.25 Suppose $A$ is an $n \times n$ matrix. To find $A^{-1}$ if it exists, form the augmented $n \times 2n$ matrix,

$$(A|I)$$

and then do row operations until you obtain an $n \times 2n$ matrix of the form

$$(I|B) \tag{2.18}$$

if possible. When this has been done, $B = A^{-1}$. The matrix $A$ has an inverse exactly when it is possible to do row operations and end up with one like 2.18.

As described above, the following is a description of what you have just done.

$$A \xrightarrow{R_1 R_2 \cdots R_n} I, \qquad I \xrightarrow{R_1 R_2 \cdots R_n} B$$

where those $R_i$ symbolize row operations. It follows that you could undo what you did by doing the inverse of these row operations in the opposite order. Thus

$$I \xrightarrow{R_n^{-1} \cdots R_2^{-1} R_1^{-1}} A, \qquad B \xrightarrow{R_n^{-1} \cdots R_2^{-1} R_1^{-1}} I$$

Here $R^{-1}$ is the row operation which undoes the row operation $R$. Therefore, if you form $(B|I)$ and do the inverse of the row operations which produced $I$ from $A$ in the reverse order, you would obtain $(I|A)$. By the same reasoning above, it follows that $A$ is a right inverse of $B$ and so $BA = I$ also. It follows from Proposition 2.1.23 that $B = A^{-1}$. Thus the procedure produces the inverse whenever it works.

If it is possible to do row operations and end up with $A \xrightarrow{\text{row operations}} I$, then the above argument shows that $A$ has an inverse. Conversely, if $A$ has an inverse, can it be found by the above procedure? In this case there exists a unique solution $\mathbf{x}$ to the equation $A\mathbf{x} = \mathbf{y}$. In fact it is just $\mathbf{x} = I\mathbf{x} = A^{-1}\mathbf{y}$. Thus in terms of augmented matrices, you would expect to obtain

$$(A|\mathbf{y}) \to (I|A^{-1}\mathbf{y})$$

That is, you would expect to be able to do row operations to $A$ and end up with $I$.

The details will be explained fully when a more careful discussion is given which is based 
on more fundamental considerations. For now, it suffices to observe that whenever the above 
procedure works, it finds the inverse. 

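Example 2.1.26 Let $A = \begin{pmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{pmatrix}$. Find $A^{-1}$.

Form the augmented matrix

$$\left( \begin{array}{ccc|ccc} 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & -1 & 1 & 0 & 1 & 0 \\ 1 & 1 & -1 & 0 & 0 & 1 \end{array} \right)$$

Now do row operations until the $n \times n$ matrix on the left becomes the identity matrix. This yields after some computations,

$$\left( \begin{array}{ccc|ccc} 1 & 0 & 0 & 0 & \frac{1}{2} & \frac{1}{2} \\ 0 & 1 & 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 1 & -\frac{1}{2} & -\frac{1}{2} \end{array} \right)$$

and so the inverse of $A$ is the matrix on the right,

$$\begin{pmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{pmatrix}$$

Checking the answer is easy. Just multiply the matrices and see if it works.

$$\begin{pmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{pmatrix} \begin{pmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$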

Always check your answer because if you are like some of us, you will usually have made a 
mistake. 
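Example 2.1.27 Let $A = \begin{pmatrix} 1 & 2 & 2 \\ 1 & 0 & 2 \\ 3 & 1 & -1 \end{pmatrix}$. Find $A^{-1}$.

Set up the augmented matrix $(A|I)$

$$\left( \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 1 & 0 & 2 & 0 & 1 & 0 \\ 3 & 1 & -1 & 0 & 0 & 1 \end{array} \right)$$

Next take $(-1)$ times the first row and add to the second followed by $(-3)$ times the first row added to the last. This yields

$$\left( \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & -5 & -7 & -3 & 0 & 1 \end{array} \right)$$

Then take 5 times the second row and add to $-2$ times the last row.

$$\left( \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right)$$

Next take the last row and add to $(-7)$ times the top row. This yields

$$\left( \begin{array}{ccc|ccc} -7 & -14 & 0 & -6 & 5 & -2 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right)$$

Now take $(-7/5)$ times the second row and add to the top.

$$\left( \begin{array}{ccc|ccc} -7 & 0 & 0 & 1 & -2 & -2 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right)$$

Finally divide the top row by $-7$, the second row by $-10$ and the bottom row by 14 which yields

$$\left( \begin{array}{ccc|ccc} 1 & 0 & 0 & -\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\ 0 & 1 & 0 & \frac{1}{2} & -\frac{1}{2} & 0 \\ 0 & 0 & 1 & \frac{1}{14} & \frac{5}{14} & -\frac{1}{7} \end{array} \right)$$

Therefore, the inverse is

$$\begin{pmatrix} -\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\ \frac{1}{2} & -\frac{1}{2} & 0 \\ \frac{1}{14} & \frac{5}{14} & -\frac{1}{7} \end{pmatrix}$$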



Example 2.1.28 Let $A = \begin{pmatrix} 1 & 2 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 4 \end{pmatrix}$. Find $A^{-1}$.

Write the augmented matrix $(A|I)$

$$\left( \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 1 & 0 & 2 & 0 & 1 & 0 \\ 2 & 2 & 4 & 0 & 0 & 1 \end{array} \right)$$

and proceed to do row operations attempting to obtain $(I|A^{-1})$. Take $(-1)$ times the top row and add to the second. Then take $(-2)$ times the top row and add to the bottom.

$$\left( \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & -2 & 0 & -2 & 0 & 1 \end{array} \right)$$

Next add $(-1)$ times the second row to the bottom row.

$$\left( \begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & -1 & 1 \end{array} \right)$$

At this point, you can see there will be no inverse because you have obtained a row of zeros in the left half of the augmented matrix $(A|I)$. Thus there will be no way to obtain $I$ on the left. In other words, the three systems of equations you must solve to find the inverse have no solution. In particular, there is no solution for the first column of $A^{-1}$ which must solve

$$A \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$

because a sequence of row operations leads to the impossible equation, $0x + 0y + 0z = -1$.
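Procedure 2.1.25 can be sketched in Python as follows. This is a minimal illustration, not a numerically careful routine: it assumes exact rational arithmetic through the standard `fractions` module, the function name `inverse` is hypothetical, and the particular order of row operations (normalize each pivot to 1, then clear its column) is one choice among many. It returns `None` when a zero column below the diagonal shows that $I$ cannot be reached on the left. The three matrices are the ones from the examples above.

```python
from fractions import Fraction


def inverse(A):
    """Row reduce (A|I); return B when (I|B) is reached, otherwise None."""
    n = len(A)
    # Augmented n x 2n matrix (A|I) with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look at or below the diagonal for a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                      # I cannot be obtained on the left
        M[col], M[pivot] = M[pivot], M[col]  # switch two rows
        p = M[col][col]
        M[col] = [x / p for x in M[col]]     # scale the pivot row so pivot is 1
        for r in range(n):
            if r != col and M[r][col] != 0:  # clear the rest of the column
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]            # the right half is the inverse


A1 = [[1, 0, 1], [1, -1, 1], [1, 1, -1]]
A2 = [[1, 2, 2], [1, 0, 2], [3, 1, -1]]
A3 = [[1, 2, 2], [1, 0, 2], [2, 2, 4]]

print([[str(x) for x in row] for row in inverse(A1)])
# [['0', '1/2', '1/2'], ['1', '-1', '0'], ['1', '-1/2', '-1/2']]
print([[str(x) for x in row] for row in inverse(A2)])
# [['-1/7', '2/7', '2/7'], ['1/2', '-1/2', '0'], ['1/14', '5/14', '-1/7']]
print(inverse(A3))
# None
```

Using `Fraction` rather than floating point keeps the output identical to the hand computation; for large matrices one would normally use a numerical library instead, with the usual caveats about rounding.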

2.2 Exercises 

1. In 2.1 - 2.8 describe -A and 0. 

2. Let A be an n x n matrix. Show A equals the sum of a symmetric and a skew symmetric 
matrix. 

3. Show every skew symmetric matrix has all zeros down the main diagonal. The main diagonal consists of every entry of the matrix which is of the form $a_{ii}$. It runs from the upper left down to the lower right.

4. Using only the properties 2.1 - 2.8 show $-A$ is unique.

5. Using only the properties 2.1 - 2.8 show $0$ is unique.

6. Using only the properties 2.1 - 2.8 show $0A = 0$. Here the $0$ on the left is the scalar $0$ and the $0$ on the right is the zero for $m \times n$ matrices.

7. Using only the properties 2.1 - 2.8 and previous problems show (—1) A = —A. 

8. Prove 2.17. 

9. Prove that I m A = A where A is an m x n matrix. 

10. Let $A$ be a real $m \times n$ matrix and let $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in \mathbb{R}^m$. Show $(A\mathbf{x}, \mathbf{y})_{\mathbb{R}^m} = (\mathbf{x}, A^T\mathbf{y})_{\mathbb{R}^n}$ where $(\cdot, \cdot)_{\mathbb{R}^k}$ denotes the dot product in $\mathbb{R}^k$.

11. Use the result of Problem 10 to verify directly that $(AB)^T = B^T A^T$ without making any reference to subscripts.

12. Let x = (-1, -1, 1) and y = (0, 1, 2) . Find x T y and xy T if possible. 

13. Give an example of matrices, $A, B, C$ such that $B \neq C$, $A \neq 0$, and yet $AB = AC$.

14. Let A = I -2 -1 ) , £ = ( J i ~X ) ' and C = ( _1 2 ° I ' Find 
if possible the following products. AB, BA, AC, CA, CB, BC. 

15. Consider the following digraph. 




Write the matrix associated with this digraph and find the number of ways to go from 
3 to 4 in three steps. 

16. Show that if $A^{-1}$ exists for an $n \times n$ matrix, then it is unique. That is, if $BA = I$ and $AB = I$, then $B = A^{-1}$.

17. Show $(AB)^{-1} = B^{-1}A^{-1}$.

18. Show that if $A$ is an invertible $n \times n$ matrix, then so is $A^T$ and $(A^T)^{-1} = (A^{-1})^T$.

19. Show that if $A$ is an $n \times n$ invertible matrix and $\mathbf{x}$ is a $n \times 1$ matrix such that $A\mathbf{x} = \mathbf{b}$ for $\mathbf{b}$ an $n \times 1$ matrix, then $\mathbf{x} = A^{-1}\mathbf{b}$.

20. Give an example of a matrix $A$ such that $A^2 = I$ and yet $A \neq I$ and $A \neq -I$.

21. Give an example of matrices, A,B such that neither A nor B equals zero and yet 
AB = 0. 



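22. Write

$$\begin{pmatrix} x_1 - x_2 + 2x_3 \\ 2x_3 + x_1 \\ 3x_3 \\ 3x_4 + 3x_2 + x_1 \end{pmatrix}$$

in the form $A \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}$ where $A$ is an appropriate matrix.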



23. Give another example other than the one given in this section of two square matrices, 
A and B such that AB ^ BA. 

24. Suppose A and B are square matrices of the same size. Which of the following are 
correct? 

(a) $(A - B)^2 = A^2 - 2AB + B^2$

(b) $(AB)^2 = A^2 B^2$

(c) $(A + B)^2 = A^2 + 2AB + B^2$

(d) $(A + B)^2 = A^2 + AB + BA + B^2$

(e) $A^2 B^2 = A(AB)B$

(f) $(A + B)^3 = A^3 + 3A^2 B + 3AB^2 + B^3$

(g) $(A + B)(A - B) = A^2 - B^2$
(h) None of the above. They are all wrong. 

(i) All of the above. They are all right. 

, . Find all 2 x 2 matrices, B such that AB = 0. 

26. Prove that if $A^{-1}$ exists and $A\mathbf{x} = \mathbf{0}$ then $\mathbf{x} = \mathbf{0}$.

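27. Let

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ 1 & 0 & 2 \end{pmatrix}$$

Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, determine why.

28. Let

$$A = \begin{pmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{pmatrix}$$

Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, determine why.

29. Let

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ 4 & 5 & 10 \end{pmatrix}$$

Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, determine why.

30. Let

$$A = \begin{pmatrix} 1 & 2 & 0 & 2 \\ 1 & 1 & 2 & 0 \\ 2 & 1 & -3 & 2 \\ 1 & 2 & 1 & 2 \end{pmatrix}$$

Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, determine why.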
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2.3 Linear Transformations 

By 2.13, if $A$ is an $m \times n$ matrix, then for $\mathbf{v}, \mathbf{u}$ vectors in $\mathbb{F}^n$ and $a, b$ scalars,

$$A(a\mathbf{u} + b\mathbf{v}) = aA\mathbf{u} + bA\mathbf{v} \in \mathbb{F}^m \tag{2.19}$$

Definition 2.3.1 A function, $A : \mathbb{F}^n \to \mathbb{F}^m$ is called a linear transformation if for all $\mathbf{u}, \mathbf{v} \in \mathbb{F}^n$ and $a, b$ scalars, 2.19 holds.

From 2.19, matrix multiplication defines a linear transformation as just defined. It turns out this is the only type of linear transformation available. Thus if $A$ is a linear transformation from $\mathbb{F}^n$ to $\mathbb{F}^m$, there is always a matrix which produces $A$. Before showing this, here is a simple definition.
Definition 2.3.2 A vector, $\mathbf{e}_i \in \mathbb{F}^n$ is defined as follows:

$$\mathbf{e}_i \equiv \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix},$$

where the 1 is in the $i^{th}$ position and there are zeros everywhere else. Thus $\mathbf{e}_i = (0, \cdots, 0, 1, 0, \cdots, 0)^T$.

Of course the $\mathbf{e}_i$ for a particular value of $i$ in $\mathbb{F}^n$ would be different than the $\mathbf{e}_i$ for that same value of $i$ in $\mathbb{F}^m$ for $m \neq n$. One of them is longer than the other. However, which one is meant will be determined by the context in which they occur.

These vectors have a significant property. 



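Lemma 2.3.3 Let $\mathbf{v} \in \mathbb{F}^n$. Thus $\mathbf{v}$ is a list of numbers arranged vertically, $v_1, \cdots, v_n$. Then

$$\mathbf{e}_i^T \mathbf{v} = v_i \tag{2.20}$$

Also, if $A$ is an $m \times n$ matrix, then letting $\mathbf{e}_i \in \mathbb{F}^m$ and $\mathbf{e}_j \in \mathbb{F}^n$,

$$\mathbf{e}_i^T A \mathbf{e}_j = A_{ij} \tag{2.21}$$

Proof: First note that $\mathbf{e}_i^T$ is a $1 \times n$ matrix and $\mathbf{v}$ is an $n \times 1$ matrix so the above multiplication in 2.20 makes perfect sense. It equals

$$(0, \cdots, 1, \cdots, 0) \begin{pmatrix} v_1 \\ \vdots \\ v_i \\ \vdots \\ v_n \end{pmatrix} = v_i$$

as claimed.

Consider 2.21. From the definition of matrix multiplication, and noting that $(\mathbf{e}_j)_k = \delta_{kj}$,

$$\mathbf{e}_i^T A \mathbf{e}_j = \mathbf{e}_i^T \begin{pmatrix} \sum_k A_{1k} (\mathbf{e}_j)_k \\ \vdots \\ \sum_k A_{ik} (\mathbf{e}_j)_k \\ \vdots \\ \sum_k A_{mk} (\mathbf{e}_j)_k \end{pmatrix} = \mathbf{e}_i^T \begin{pmatrix} A_{1j} \\ \vdots \\ A_{ij} \\ \vdots \\ A_{mj} \end{pmatrix} = A_{ij}$$

by the first part of the lemma. ■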

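Theorem 2.3.4 Let $L : \mathbb{F}^n \to \mathbb{F}^m$ be a linear transformation. Then there exists a unique $m \times n$ matrix $A$ such that

$$A\mathbf{x} = L\mathbf{x}$$

for all $\mathbf{x} \in \mathbb{F}^n$. The $ik^{th}$ entry of this matrix is given by

$$\mathbf{e}_i^T L \mathbf{e}_k \tag{2.22}$$

Stated in another way, the $k^{th}$ column of $A$ equals $L\mathbf{e}_k$.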
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Proof: By the lemma, 



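$$(L\mathbf{x})_i = \mathbf{e}_i^T L\mathbf{x} = \mathbf{e}_i^T \sum_{k} x_k L\mathbf{e}_k = \sum_{k} (\mathbf{e}_i^T L \mathbf{e}_k) x_k.$$

Let $A_{ik} = \mathbf{e}_i^T L \mathbf{e}_k$, to prove the existence part of the theorem.

To verify uniqueness, suppose $B\mathbf{x} = A\mathbf{x} = L\mathbf{x}$ for all $\mathbf{x} \in \mathbb{F}^n$. Then in particular, this is true for $\mathbf{x} = \mathbf{e}_j$ and then multiply on the left by $\mathbf{e}_i^T$ to obtain

$$B_{ij} = \mathbf{e}_i^T B \mathbf{e}_j = \mathbf{e}_i^T A \mathbf{e}_j = A_{ij}$$

showing $A = B$. ■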



Corollary 2.3.5 A linear transformation, $L : \mathbb{F}^n \to \mathbb{F}^m$ is completely determined by the vectors $\{L\mathbf{e}_1, \cdots, L\mathbf{e}_n\}$.

Proof: This follows immediately from the above theorem. The unique matrix determining the linear transformation which is given in 2.22 depends only on these vectors. ■

This theorem shows that any linear transformation defined on $\mathbb{F}^n$ can always be considered as a matrix. Therefore, the terms "linear transformation" and "matrix" are often used interchangeably. For example, to say that a matrix is one to one, means the linear transformation determined by the matrix is one to one.
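The statement that the $k^{th}$ column of the matrix equals $L\mathbf{e}_k$ translates directly into a short Python sketch. Everything named here is an assumption made for illustration: the helper `matrix_of`, the helper `mat_vec`, and in particular the map `L`, which is a made-up linear map from $\mathbb{R}^3$ to $\mathbb{R}^2$ and does not come from the text.

```python
def matrix_of(L, n):
    """Matrix whose k-th column is L(e_k), for a linear map L on lists of length n."""
    columns = []
    for k in range(n):
        e_k = [1 if i == k else 0 for i in range(n)]   # standard basis vector
        columns.append(L(e_k))
    m = len(columns[0])
    # Entry (i, k) of the matrix is the i-th entry of L(e_k).
    return [[columns[k][i] for k in range(n)] for i in range(m)]


def mat_vec(A, x):
    """Matrix times vector, entry by entry."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def L(x):
    """A hypothetical linear map from R^3 to R^2, chosen only for illustration."""
    x1, x2, x3 = x
    return [x1 + 2 * x2 - x3, 3 * x2 + 4 * x3]


A = matrix_of(L, 3)
print(A)                         # [[1, 2, -1], [0, 3, 4]]

x = [2, -1, 5]
assert mat_vec(A, x) == L(x)     # Ax = Lx, as Theorem 2.3.4 asserts
```

The sketch only evaluates `L` on the $n$ vectors $\mathbf{e}_1, \cdots, \mathbf{e}_n$, which is exactly the content of Corollary 2.3.5. If `L` were not linear, `matrix_of` would still return a matrix, but the final assertion could fail.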

Example 2.3.6 Find the linear transformation, L : R 2 — > R 2 which has the property that 
Lei = ( ) and Le 2 = ( „ J . From the above theorem and corollary, this linear trans- 
formation is that determined by matrix multiplication by the matrix 

Definition 2.3.7 Let $L : \mathbb{F}^n \to \mathbb{F}^m$ be a linear transformation and let its matrix be the $m \times n$ matrix $A$. Then $\ker(L) \equiv \{\mathbf{x} \in \mathbb{F}^n : L\mathbf{x} = \mathbf{0}\}$. Sometimes people also write this as $N(A)$, the null space of $A$.

Then there is a fundamental result in the case where $m < n$. In this case, the matrix $A$ of the linear transformation has more columns than rows.

Theorem 2.3.8 Let $A$ be an $m \times n$ matrix where $m < n$. Then $N(A)$ contains nonzero vectors.

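Proof: First consider the case where $A$ is a $1 \times n$ matrix for $n > 1$. Say

$$A = (a_1 \ \cdots \ a_n)$$

If $a_1 = 0$, consider the vector $\mathbf{x} = \mathbf{e}_1$. If $a_1 \neq 0$, let

$$\mathbf{x} = \begin{pmatrix} b \\ 1 \\ \vdots \\ 1 \end{pmatrix}$$

where $b$ is chosen to satisfy the equation

$$a_1 b + \sum_{k=2}^{n} a_k = 0$$

Suppose now that the theorem is true for any $m \times n$ matrix with $n > m$ and consider an $(m + 1) \times n$ matrix $A$ where $n > m + 1$. If the first column of $A$ is $\mathbf{0}$, then you could let $\mathbf{x} = \mathbf{e}_1$ as above. If the first column is not the zero vector, then by doing row operations, the equation $A\mathbf{x} = \mathbf{0}$ can be reduced to the equivalent system

$$A_1 \mathbf{x} = \mathbf{0}$$

where $A_1$ is of the form

$$A_1 = \begin{pmatrix} 1 & \mathbf{a}^T \\ \mathbf{0} & B \end{pmatrix}$$

where $B$ is an $m \times (n - 1)$ matrix. Since $n > m + 1$, it follows that $(n - 1) > m$ and so by induction, there exists a nonzero vector $\mathbf{y} \in \mathbb{F}^{n-1}$ such that $B\mathbf{y} = \mathbf{0}$. Then consider the vector

$$\mathbf{x} = \begin{pmatrix} b \\ \mathbf{y} \end{pmatrix}$$

$A_1 \mathbf{x}$ has for its top entry the expression $b + \mathbf{a}^T \mathbf{y}$. Letting $B = \begin{pmatrix} \mathbf{b}_1^T \\ \vdots \\ \mathbf{b}_m^T \end{pmatrix}$, the $i^{th}$ entry of $A_1 \mathbf{x}$ for $i > 1$ is of the form $\mathbf{b}_{i-1}^T \mathbf{y} = 0$. Thus if $b$ is chosen to satisfy the equation $b + \mathbf{a}^T \mathbf{y} = 0$, then $A_1 \mathbf{x} = \mathbf{0}$. ■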
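To make the base case of this argument concrete, here is a small worked instance. The $1 \times 3$ matrix $A = (2 \ 1 \ 3)$ is an illustrative choice, not one taken from the text. Since its first entry is nonzero, take a vector with first entry $b$ and all other entries 1, then solve for $b$.

```latex
A = \begin{pmatrix} 2 & 1 & 3 \end{pmatrix}, \qquad
\mathbf{x} = \begin{pmatrix} b \\ 1 \\ 1 \end{pmatrix}, \qquad
2b + (1 + 3) = 0 \;\Longrightarrow\; b = -2,
\qquad
A\mathbf{x} = 2(-2) + 1(1) + 3(1) = 0 .
```

The resulting vector $(-2, 1, 1)^T$ is nonzero and lies in $N(A)$, as the theorem promises whenever there are more columns than rows.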
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2.4 Subspaces And Spans 



Definition 2.4.1 Let $\{\mathbf{x}_1, \cdots, \mathbf{x}_p\}$ be vectors in $\mathbb{F}^n$. A linear combination is any expression of the form

$$\sum_{i=1}^{p} c_i \mathbf{x}_i$$

where the $c_i$ are scalars. The set of all linear combinations of these vectors is called $\text{span}(\mathbf{x}_1, \cdots, \mathbf{x}_p)$. If $V \subseteq \mathbb{F}^n$, then $V$ is called a subspace if whenever $\alpha, \beta$ are scalars and $\mathbf{u}$ and $\mathbf{v}$ are vectors of $V$, it follows $\alpha\mathbf{u} + \beta\mathbf{v} \in V$. That is, it is "closed under the algebraic operations of vector addition and scalar multiplication". A linear combination of vectors is said to be trivial if all the scalars in the linear combination equal zero. A set of vectors is said to be linearly independent if the only linear combination of these vectors which equals the zero vector is the trivial linear combination. Thus $\{\mathbf{x}_1, \cdots, \mathbf{x}_p\}$ is called linearly independent if whenever

$$\sum_{k=1}^{p} c_k \mathbf{x}_k = \mathbf{0}$$

it follows that all the scalars $c_k$ equal zero. A set of vectors, $\{\mathbf{x}_1, \cdots, \mathbf{x}_p\}$, is called linearly dependent if it is not linearly independent. Thus the set of vectors is linearly dependent if there exist scalars $c_i$, $i = 1, \cdots, p$, not all zero such that $\sum_{k=1}^{p} c_k \mathbf{x}_k = \mathbf{0}$.

Proposition 2.4.2 Let $V \subseteq \mathbb{F}^n$. Then $V$ is a subspace if and only if it is a vector space itself with respect to the same operations of scalar multiplication and vector addition.

Proof: Suppose first that $V$ is a subspace. All algebraic properties involving scalar multiplication and vector addition hold for $V$ because these things hold for $\mathbb{F}^n$. Is $\mathbf{0} \in V$? Yes it is. This is because $0\mathbf{v} \in V$ and $0\mathbf{v} = \mathbf{0}$. By assumption, for $\alpha$ a scalar and $\mathbf{v} \in V$, $\alpha\mathbf{v} \in V$. Therefore, $-\mathbf{v} = (-1)\mathbf{v} \in V$. Thus $V$ has the additive identity and additive inverse. By assumption, $V$ is closed with respect to the two operations. Thus $V$ is a vector space. If $V \subseteq \mathbb{F}^n$ is a vector space, then by definition, if $\alpha, \beta$ are scalars and $\mathbf{u}, \mathbf{v}$ vectors in $V$, it follows that $\alpha\mathbf{v} + \beta\mathbf{u} \in V$. ■

Thus, from the above, subspaces of $\mathbb{F}^n$ are just subsets of $\mathbb{F}^n$ which are themselves vector spaces.

Lemma 2.4.3 A set of vectors $\{\mathbf{x}_1, \cdots, \mathbf{x}_p\}$ is linearly independent if and only if none of the vectors can be obtained as a linear combination of the others.

Proof: Suppose first that $\{\mathbf{x}_1, \cdots, \mathbf{x}_p\}$ is linearly independent. If $\mathbf{x}_k = \sum_{j \neq k} c_j \mathbf{x}_j$, then

$$\mathbf{0} = 1\mathbf{x}_k + \sum_{j \neq k} (-c_j) \mathbf{x}_j,$$

a nontrivial linear combination, contrary to assumption. This shows that if the set is linearly independent, then none of the vectors is a linear combination of the others.

Now suppose no vector is a linear combination of the others. Is $\{\mathbf{x}_1, \cdots, \mathbf{x}_p\}$ linearly independent? If it is not, there exist scalars $c_i$, not all zero such that

$$\sum_{i=1}^{p} c_i \mathbf{x}_i = \mathbf{0}.$$

Say $c_k \neq 0$. Then you can solve for $\mathbf{x}_k$ as

$$\mathbf{x}_k = \sum_{j \neq k} (-c_j / c_k) \mathbf{x}_j$$

contrary to assumption. ■
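The definition of linear dependence asks for scalars $c_k$, not all zero, with $\sum_k c_k \mathbf{x}_k = \mathbf{0}$. That is a homogeneous linear system whose coefficient matrix has the vectors as its columns, so it can be searched for by row reduction. The Python sketch below does this under stated assumptions: exact arithmetic via the standard `fractions` module, hypothetical function names `null_vector` and `dependence`, and sample vectors invented for illustration.

```python
from fractions import Fraction


def null_vector(A):
    """Return a nonzero x with Ax = 0 if one exists, otherwise None."""
    m, n = len(A), len(A[0])
    M = [[Fraction(x) for x in row] for row in A]
    pivots = []                  # pivots[r] is the pivot column of row r
    row = 0
    for col in range(n):
        if row == m:
            break
        pivot = next((r for r in range(row, m) if M[r][col] != 0), None)
        if pivot is None:
            continue             # no pivot in this column: it is a free column
        M[row], M[pivot] = M[pivot], M[row]
        p = M[row][col]
        M[row] = [x / p for x in M[row]]
        for r in range(m):
            if r != row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
    free = [c for c in range(n) if c not in pivots]
    if not free:
        return None              # only the zero vector solves Ax = 0
    f = free[0]
    x = [Fraction(0)] * n
    x[f] = Fraction(1)           # set one free variable to 1, the others to 0
    for r, c in enumerate(pivots):
        x[c] = -M[r][f]          # each pivot variable is then determined
    return x


def dependence(vectors):
    """Coefficients of a nontrivial combination equal to 0, or None if independent."""
    as_columns = [list(row) for row in zip(*vectors)]
    return null_vector(as_columns)


x1, x2, x3 = [1, 0, 2], [0, 1, 1], [1, 1, 3]

print(dependence([x1, x2]))        # None: the two vectors are independent
c = dependence([x1, x2, x3])
print([str(t) for t in c])         # ['-1', '-1', '1']
```

The coefficients found say $-\mathbf{x}_1 - \mathbf{x}_2 + \mathbf{x}_3 = \mathbf{0}$, so $\mathbf{x}_3 = \mathbf{x}_1 + \mathbf{x}_2$, which is exactly the "solve for one vector in terms of the others" step in the proof of Lemma 2.4.3. When there are more vectors than entries, the matrix has more columns than rows and Theorem 2.3.8 guarantees that `null_vector` finds something.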

The following is called the exchange theorem. 

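Theorem 2.4.4 (Exchange Theorem) Let $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ be a linearly independent set of vectors such that each $\mathbf{x}_i$ is in $\text{span}(\mathbf{y}_1, \cdots, \mathbf{y}_s)$. Then $r \leq s$.

Proof 1: Suppose not. Then $r > s$. By assumption, there exist scalars $a_{ji}$ such that

$$\mathbf{x}_i = \sum_{j=1}^{s} a_{ji} \mathbf{y}_j$$

The matrix $A$ whose $ji^{th}$ entry is $a_{ji}$ has more columns than rows. Therefore, by Theorem 2.3.8 there exists a nonzero vector $\mathbf{b} \in \mathbb{F}^r$ such that $A\mathbf{b} = \mathbf{0}$. Thus

$$0 = \sum_{i=1}^{r} a_{ji} b_i, \text{ each } j.$$

Then

$$\sum_{i=1}^{r} b_i \mathbf{x}_i = \sum_{i=1}^{r} b_i \sum_{j=1}^{s} a_{ji} \mathbf{y}_j = \sum_{j=1}^{s} \left( \sum_{i=1}^{r} a_{ji} b_i \right) \mathbf{y}_j = \mathbf{0}$$

contradicting the assumption that $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ is linearly independent.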

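Proof 2: Define $\text{span}\{\mathbf{y}_1, \cdots, \mathbf{y}_s\} \equiv V$. It follows there exist scalars $c_1, \cdots, c_s$ such that

$$\mathbf{x}_1 = \sum_{i=1}^{s} c_i \mathbf{y}_i. \tag{2.23}$$

Not all of these scalars can equal zero because if this were the case, it would follow that $\mathbf{x}_1 = \mathbf{0}$ and so $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ would not be linearly independent. Indeed, if $\mathbf{x}_1 = \mathbf{0}$, $1\mathbf{x}_1 + \sum_{i=2}^{r} 0\mathbf{x}_i = \mathbf{x}_1 = \mathbf{0}$ and so there would exist a nontrivial linear combination of the vectors $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ which equals zero.

Say $c_k \neq 0$. Then solve (2.23) for $\mathbf{y}_k$ and obtain

$$\mathbf{y}_k \in \text{span}(\mathbf{x}_1, \underbrace{\mathbf{y}_1, \cdots, \mathbf{y}_{k-1}, \mathbf{y}_{k+1}, \cdots, \mathbf{y}_s}_{s-1 \text{ vectors here}}).$$

Define $\{\mathbf{z}_1, \cdots, \mathbf{z}_{s-1}\}$ by

$$\{\mathbf{z}_1, \cdots, \mathbf{z}_{s-1}\} \equiv \{\mathbf{y}_1, \cdots, \mathbf{y}_{k-1}, \mathbf{y}_{k+1}, \cdots, \mathbf{y}_s\}$$

Therefore, $\text{span}\{\mathbf{x}_1, \mathbf{z}_1, \cdots, \mathbf{z}_{s-1}\} = V$ because if $\mathbf{v} \in V$, there exist constants $c_1, \cdots, c_s$ such that

$$\mathbf{v} = \sum_{i=1}^{s-1} c_i \mathbf{z}_i + c_s \mathbf{y}_k.$$

Now replace the $\mathbf{y}_k$ in the above with a linear combination of the vectors, $\{\mathbf{x}_1, \mathbf{z}_1, \cdots, \mathbf{z}_{s-1}\}$ to obtain $\mathbf{v} \in \text{span}\{\mathbf{x}_1, \mathbf{z}_1, \cdots, \mathbf{z}_{s-1}\}$. The vector $\mathbf{y}_k$, in the list $\{\mathbf{y}_1, \cdots, \mathbf{y}_s\}$, has now been replaced with the vector $\mathbf{x}_1$ and the resulting modified list of vectors has the same span as the original list of vectors, $\{\mathbf{y}_1, \cdots, \mathbf{y}_s\}$.

Now suppose that $r > s$ and that $\text{span}\{\mathbf{x}_1, \cdots, \mathbf{x}_l, \mathbf{z}_1, \cdots, \mathbf{z}_p\} = V$ where the vectors, $\mathbf{z}_1, \cdots, \mathbf{z}_p$ are each taken from the set, $\{\mathbf{y}_1, \cdots, \mathbf{y}_s\}$ and $l + p = s$. This has now been done for $l = 1$ above. Then since $r > s$, it follows that $l \leq s < r$ and so $l + 1 \leq r$. Therefore, $\mathbf{x}_{l+1}$ is a vector not in the list, $\{\mathbf{x}_1, \cdots, \mathbf{x}_l\}$ and since $\text{span}\{\mathbf{x}_1, \cdots, \mathbf{x}_l, \mathbf{z}_1, \cdots, \mathbf{z}_p\} = V$, there exist scalars $c_i$ and $d_j$ such that

$$\mathbf{x}_{l+1} = \sum_{i=1}^{l} c_i \mathbf{x}_i + \sum_{j=1}^{p} d_j \mathbf{z}_j. \tag{2.24}$$

Now not all the $d_j$ can equal zero because if this were so, it would follow that $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ would be a linearly dependent set because one of the vectors would equal a linear combination of the others. Therefore, (2.24) can be solved for one of the $\mathbf{z}_i$, say $\mathbf{z}_k$, in terms of $\mathbf{x}_{l+1}$ and the other $\mathbf{z}_i$ and just as in the above argument, replace that $\mathbf{z}_i$ with $\mathbf{x}_{l+1}$ to obtain

$$\text{span}\{\mathbf{x}_1, \cdots, \mathbf{x}_l, \mathbf{x}_{l+1}, \underbrace{\mathbf{z}_1, \cdots, \mathbf{z}_{k-1}, \mathbf{z}_{k+1}, \cdots, \mathbf{z}_p}_{p-1 \text{ vectors here}}\} = V.$$

Continue this way, eventually obtaining

$$\text{span}\{\mathbf{x}_1, \cdots, \mathbf{x}_s\} = V.$$

But then $\mathbf{x}_r \in \text{span}\{\mathbf{x}_1, \cdots, \mathbf{x}_s\}$ contrary to the assumption that $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ is linearly independent. Therefore, $r \leq s$ as claimed.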

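Proof 3: Suppose $r > s$. Let $\mathbf{z}_k$ denote a vector of $\{\mathbf{y}_1, \cdots, \mathbf{y}_s\}$. Thus there exists $j$ as small as possible such that

$$\operatorname{span}(\mathbf{y}_1, \cdots, \mathbf{y}_s) = \operatorname{span}(\mathbf{x}_1, \cdots, \mathbf{x}_m, \mathbf{z}_1, \cdots, \mathbf{z}_j)$$

where $m + j = s$. It is given that $m = 0$, corresponding to no vectors of $\{\mathbf{x}_1, \cdots, \mathbf{x}_m\}$, and $j = s$, corresponding to all the $\mathbf{y}_k$, results in the above equation holding. If $j > 0$ then $m < s$ and so

$$\mathbf{x}_{m+1} = \sum_{k=1}^{m} a_k \mathbf{x}_k + \sum_{i=1}^{j} b_i \mathbf{z}_i.$$

Not all the $b_i$ can equal $0$ and so you can solve for one of the $\mathbf{z}_i$ in terms of $\mathbf{x}_{m+1}, \mathbf{x}_m, \cdots, \mathbf{x}_1$, and the other $\mathbf{z}_k$. Therefore, there exists

$$\{\mathbf{z}_1, \cdots, \mathbf{z}_{j-1}\} \subseteq \{\mathbf{y}_1, \cdots, \mathbf{y}_s\}$$

such that

$$\operatorname{span}(\mathbf{y}_1, \cdots, \mathbf{y}_s) = \operatorname{span}(\mathbf{x}_1, \cdots, \mathbf{x}_{m+1}, \mathbf{z}_1, \cdots, \mathbf{z}_{j-1})$$

contradicting the choice of $j$. Hence $j = 0$ and

$$\operatorname{span}(\mathbf{y}_1, \cdots, \mathbf{y}_s) = \operatorname{span}(\mathbf{x}_1, \cdots, \mathbf{x}_s).$$

It follows that

$$\mathbf{x}_{s+1} \in \operatorname{span}(\mathbf{x}_1, \cdots, \mathbf{x}_s)$$

contrary to the assumption the $\mathbf{x}_k$ are linearly independent. Therefore, $r \le s$ as claimed. ■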

Definition 2.4.5 A finite set of vectors, $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ is a basis for $\mathbb{F}^n$ if $\operatorname{span}(\mathbf{x}_1, \cdots, \mathbf{x}_r) = \mathbb{F}^n$ and $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ is linearly independent.

Corollary 2.4.6 Let $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ and $\{\mathbf{y}_1, \cdots, \mathbf{y}_s\}$ be two bases$^1$ of $\mathbb{F}^n$. Then $r = s = n$.

$^1$ This is the plural form of basis. We could say basiss but it would involve an inordinate amount of hissing as in "The sixth sheik's sixth sheep is sick". This is the reason that bases is used instead of basiss.

Proof: From the exchange theorem, $r \le s$ and $s \le r$. Now note the vectors,

$$\mathbf{e}_i = (0, \cdots, 0, 1, 0, \cdots, 0), \text{ with the } 1 \text{ in the } i^{th} \text{ slot},$$

for $i = 1, 2, \cdots, n$ are a basis for $\mathbb{F}^n$. ■

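Lemma 2.4.7 Let $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\}$ be a set of vectors. Then $V \equiv \operatorname{span}(\mathbf{v}_1, \cdots, \mathbf{v}_r)$ is a subspace.

Proof: Suppose $\alpha, \beta$ are two scalars and let $\sum_{k=1}^{r} c_k \mathbf{v}_k$ and $\sum_{k=1}^{r} d_k \mathbf{v}_k$ be two elements of $V$. What about

$$\alpha \sum_{k=1}^{r} c_k \mathbf{v}_k + \beta \sum_{k=1}^{r} d_k \mathbf{v}_k?$$

Is it also in $V$?

$$\alpha \sum_{k=1}^{r} c_k \mathbf{v}_k + \beta \sum_{k=1}^{r} d_k \mathbf{v}_k = \sum_{k=1}^{r} (\alpha c_k + \beta d_k) \mathbf{v}_k \in V$$

so the answer is yes. ■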

Definition 2.4.8 A finite set of vectors, $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ is a basis for a subspace $V$ of $\mathbb{F}^n$ if $\operatorname{span}(\mathbf{x}_1, \cdots, \mathbf{x}_r) = V$ and $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ is linearly independent.

Corollary 2.4.9 Let $\{\mathbf{x}_1, \cdots, \mathbf{x}_r\}$ and $\{\mathbf{y}_1, \cdots, \mathbf{y}_s\}$ be two bases for $V$. Then $r = s$.

Proof: From the exchange theorem, $r \le s$ and $s \le r$. ■

Definition 2.4.10 Let $V$ be a subspace of $\mathbb{F}^n$. Then $\dim(V)$, read as the dimension of $V$, is the number of vectors in a basis.

Of course you should wonder right now whether an arbitrary subspace even has a basis. 
In fact it does and this is in the next theorem. First, here is an interesting lemma. 

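Lemma 2.4.11 Suppose $\mathbf{v} \notin \operatorname{span}(\mathbf{u}_1, \cdots, \mathbf{u}_k)$ and $\{\mathbf{u}_1, \cdots, \mathbf{u}_k\}$ is linearly independent. Then $\{\mathbf{u}_1, \cdots, \mathbf{u}_k, \mathbf{v}\}$ is also linearly independent.

Proof: Suppose $\sum_{i=1}^{k} c_i \mathbf{u}_i + d\mathbf{v} = \mathbf{0}$. It is required to verify that each $c_i = 0$ and that $d = 0$. But if $d \neq 0$, then you can solve for $\mathbf{v}$ as a linear combination of the vectors $\{\mathbf{u}_1, \cdots, \mathbf{u}_k\}$,

$$\mathbf{v} = -\sum_{i=1}^{k} \left( \frac{c_i}{d} \right) \mathbf{u}_i$$

contrary to assumption. Therefore, $d = 0$. But then $\sum_{i=1}^{k} c_i \mathbf{u}_i = \mathbf{0}$ and the linear independence of $\{\mathbf{u}_1, \cdots, \mathbf{u}_k\}$ implies each $c_i = 0$ also. ■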

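Theorem 2.4.12 Let $V$ be a nonzero subspace of $\mathbb{F}^n$. Then $V$ has a basis.

Proof: Let $\mathbf{v}_1 \in V$ where $\mathbf{v}_1 \neq \mathbf{0}$. If $\operatorname{span}\{\mathbf{v}_1\} = V$, stop. $\{\mathbf{v}_1\}$ is a basis for $V$. Otherwise, there exists $\mathbf{v}_2 \in V$ which is not in $\operatorname{span}\{\mathbf{v}_1\}$. By Lemma 2.4.11, $\{\mathbf{v}_1, \mathbf{v}_2\}$ is a linearly independent set of vectors. If $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\} = V$, stop. $\{\mathbf{v}_1, \mathbf{v}_2\}$ is a basis for $V$. If $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\} \neq V$, then there exists $\mathbf{v}_3 \notin \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\}$ and $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a larger linearly independent set of vectors. Continuing this way, the process must stop before $n + 1$ steps because if not, it would be possible to obtain $n + 1$ linearly independent vectors contrary to the exchange theorem. ■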

In words the following corollary states that any linearly independent set of vectors can 
be enlarged to form a basis. 

Corollary 2.4.13 Let $V$ be a subspace of $\mathbb{F}^n$ and let $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\}$ be a linearly independent set of vectors in $V$. Then either it is a basis for $V$ or there exist vectors $\mathbf{v}_{r+1}, \cdots, \mathbf{v}_s$ such that $\{\mathbf{v}_1, \cdots, \mathbf{v}_r, \mathbf{v}_{r+1}, \cdots, \mathbf{v}_s\}$ is a basis for $V$.

Proof: This follows immediately from the proof of Theorem 2.4.12. You do exactly the same argument except you start with $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\}$ rather than $\{\mathbf{v}_1\}$. ■

It is also true that any spanning set of vectors can be restricted to obtain a basis. 

Theorem 2.4.14 Let $V$ be a subspace of $\mathbb{F}^n$ and suppose $\operatorname{span}(\mathbf{u}_1, \cdots, \mathbf{u}_p) = V$ where the $\mathbf{u}_i$ are nonzero vectors. Then there exist vectors $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\}$ such that $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\} \subseteq \{\mathbf{u}_1, \cdots, \mathbf{u}_p\}$ and $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\}$ is a basis for $V$.

Proof: Let $r$ be the smallest positive integer with the property that for some set $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\} \subseteq \{\mathbf{u}_1, \cdots, \mathbf{u}_p\}$,

$$\operatorname{span}(\mathbf{v}_1, \cdots, \mathbf{v}_r) = V.$$

Then $r \le p$ and it must be the case that $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\}$ is linearly independent because if it were not so, one of the vectors, say $\mathbf{v}_k$, would be a linear combination of the others. But then you could delete this vector from $\{\mathbf{v}_1, \cdots, \mathbf{v}_r\}$ and the resulting list of $r - 1$ vectors would still span $V$ contrary to the definition of $r$. ■
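
As a rough computational companion to Theorem 2.4.14, the following Python sketch thins a spanning list down to a basis. It does not follow the minimality argument of the proof. Instead it uses a greedy sieve, keeping a vector only when it is not in the span of those already kept, which is the idea of Lemma 2.4.11. The helper names `rank` and `basis_from_spanning_set` and the sample vectors are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def basis_from_spanning_set(vectors):
    """Keep each vector that is not in the span of the vectors already kept."""
    basis = []
    for v in vectors:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

# Hypothetical spanning list for a 2-dimensional subspace of F^3.
spanning = [(1, 0, 1), (2, 0, 2), (0, 1, 0), (1, 1, 1)]
basis = basis_from_spanning_set(spanning)
print(basis)
```

Exact rational arithmetic is used so that the dependence test is not affected by rounding.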
2.5 An Application To Matrices

The following is a theorem of major significance. 

Theorem 2.5.1 Suppose A is an n x n matrix. Then A is one to one (injective) if and 
only if A is onto (surjective). Also, if B is an n x n matrix and AB = I, then it follows 
BA = I. 

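Proof: First suppose $A$ is one to one. Consider the vectors $\{A\mathbf{e}_1, \cdots, A\mathbf{e}_n\}$ where $\mathbf{e}_k$ is the column vector which is all zeros except for a 1 in the $k^{th}$ position. This set of vectors is linearly independent because if

$$\sum_{k=1}^{n} c_k A\mathbf{e}_k = \mathbf{0},$$

then since $A$ is linear,

$$A\left( \sum_{k=1}^{n} c_k \mathbf{e}_k \right) = \mathbf{0}$$

and since $A$ is one to one, it follows

$$\sum_{k=1}^{n} c_k \mathbf{e}_k = \mathbf{0}$$

which implies each $c_k = 0$ because the $\mathbf{e}_k$ are clearly linearly independent.

Therefore, $\{A\mathbf{e}_1, \cdots, A\mathbf{e}_n\}$ must be a basis for $\mathbb{F}^n$ because if not there would exist a vector $\mathbf{y} \notin \operatorname{span}(A\mathbf{e}_1, \cdots, A\mathbf{e}_n)$ and then by Lemma 2.4.11, $\{A\mathbf{e}_1, \cdots, A\mathbf{e}_n, \mathbf{y}\}$ would be an independent set of vectors having $n + 1$ vectors in it, contrary to the exchange theorem. It follows that for $\mathbf{y} \in \mathbb{F}^n$ there exist constants $c_i$ such that

$$\mathbf{y} = \sum_{k=1}^{n} c_k A\mathbf{e}_k = A\left( \sum_{k=1}^{n} c_k \mathbf{e}_k \right)$$

showing that, since $\mathbf{y}$ was arbitrary, $A$ is onto.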

Next suppose $A$ is onto. This means the span of the columns of $A$ equals $\mathbb{F}^n$. If these columns are not linearly independent, then by Lemma 2.4.3 on Page 74, one of the columns is a linear combination of the others and so the span of the columns of $A$ equals the span of the $n - 1$ other columns. This violates the exchange theorem because $\{\mathbf{e}_1, \cdots, \mathbf{e}_n\}$ would be a linearly independent set of vectors contained in the span of only $n - 1$ vectors. Therefore, the columns of $A$ must be independent and this is equivalent to saying that $A\mathbf{x} = \mathbf{0}$ if and only if $\mathbf{x} = \mathbf{0}$. This implies $A$ is one to one because if $A\mathbf{x} = A\mathbf{y}$, then $A(\mathbf{x} - \mathbf{y}) = \mathbf{0}$ and so $\mathbf{x} - \mathbf{y} = \mathbf{0}$.

Now suppose $AB = I$. Why is $BA = I$? Since $AB = I$ it follows $B$ is one to one since otherwise, there would exist $\mathbf{x} \neq \mathbf{0}$ such that $B\mathbf{x} = \mathbf{0}$ and then $AB\mathbf{x} = A\mathbf{0} = \mathbf{0} \neq I\mathbf{x}$. Therefore, from what was just shown, $B$ is also onto. In addition to this, $A$ must be one to one because if $A\mathbf{y} = \mathbf{0}$, then $\mathbf{y} = B\mathbf{x}$ for some $\mathbf{x}$ and then $\mathbf{x} = AB\mathbf{x} = A\mathbf{y} = \mathbf{0}$ showing $\mathbf{y} = \mathbf{0}$. Now from what is given to be so, it follows $(AB)A = A$ and so using the associative law for matrix multiplication,

$$A(BA) - A = A(BA - I) = 0.$$

But this means $(BA - I)\mathbf{x} = \mathbf{0}$ for all $\mathbf{x}$ since otherwise, $A$ would not be one to one. Hence $BA = I$ as claimed. ■
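
The last claim of Theorem 2.5.1 can be checked on small examples. The sketch below uses plain Python lists. The matrices `A`, `B`, `C`, `D` are illustrative choices made for this check, not taken from the text. The square pair shows a one-sided inverse acting on both sides, and the non-square pair shows how the conclusion can fail when the matrices are not square.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Square case (hypothetical 2 x 2 example): AB = I forces BA = I.
A = [[2, 1], [1, 1]]
B = [[1, -1], [-1, 2]]
print(matmul(A, B) == identity(2), matmul(B, A) == identity(2))

# Non-square case (hypothetical example): DC = I but CD is not the identity.
C = [[1, 0], [0, 1], [1, 0]]   # 3 x 2
D = [[1, 0, 0], [1, 1, -1]]    # 2 x 3
print(matmul(D, C) == identity(2), matmul(C, D) == identity(3))
```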
This theorem shows that if an $n \times n$ matrix $B$ acts like an inverse when multiplied on one side of $A$, it follows that $B = A^{-1}$ and it will act like an inverse on both sides of $A$.
The conclusion of this theorem pertains to square matrices only. For example, let



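$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & -1 \end{pmatrix}. \qquad (2.25)$$

Then

$$BA = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{but} \quad AB = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & -1 \\ 1 & 0 & 0 \end{pmatrix}.$$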



2.6 Matrices And Calculus 

The study of moving coordinate systems gives a non trivial example of the usefulness of the 
ideas involving linear transformations and matrices. To begin with, here is the concept of 
the product rule extended to matrix multiplication. 

Definition 2.6.1 Let $A(t)$ be an $m \times n$ matrix. Say $A(t) = (A_{ij}(t))$. Suppose also that $A_{ij}(t)$ is a differentiable function for all $i, j$. Then define $A'(t) \equiv (A'_{ij}(t))$. That is, $A'(t)$ is the matrix which consists of replacing each entry by its derivative. Such an $m \times n$ matrix in which the entries are differentiable functions is called a differentiable matrix.

The next lemma is just a version of the product rule. 

Lemma 2.6.2 Let A (t) be an m x n matrix and let B (t) be an n x p matrix with the 
property that all the entries of these matrices are differentiable functions. Then 

(A(t)B(t))' = A'(t)B(t) + A(t)B'(t). 

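Proof: This is like the usual proof.

$$\frac{1}{h}\left( A(t+h)B(t+h) - A(t)B(t) \right) = \frac{1}{h}\left( A(t+h)B(t+h) - A(t+h)B(t) \right) + \frac{1}{h}\left( A(t+h)B(t) - A(t)B(t) \right)$$

$$= A(t+h)\,\frac{B(t+h) - B(t)}{h} + \frac{A(t+h) - A(t)}{h}\,B(t)$$

and now, using the fact that the entries of the matrices are all differentiable, one can pass to a limit in both sides as $h \to 0$ and conclude that

$$(A(t)B(t))' = A'(t)B(t) + A(t)B'(t). \ ■$$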
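
A quick numerical sanity check of Lemma 2.6.2 can be sketched in Python. The matrix functions `A`, `B` and their entrywise derivatives `dA`, `dB` below are hypothetical examples chosen only for illustration. The derivative of the product is approximated by a central difference and compared with $A'B + AB'$.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def madd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Hypothetical differentiable matrices and their entrywise derivatives.
def A(t):  return [[t, t * t], [1.0, math.sin(t)]]
def dA(t): return [[1.0, 2 * t], [0.0, math.cos(t)]]
def B(t):  return [[math.cos(t), 1.0], [t, t ** 3]]
def dB(t): return [[-math.sin(t), 0.0], [1.0, 3 * t * t]]

def numeric_derivative(F, t, h=1e-6):
    """Entrywise central-difference derivative of a matrix function F."""
    Fp, Fm = F(t + h), F(t - h)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)] for rp, rm in zip(Fp, Fm)]

t0 = 0.7
lhs = numeric_derivative(lambda t: matmul(A(t), B(t)), t0)
rhs = madd(matmul(dA(t0), B(t0)), matmul(A(t0), dB(t0)))
max_err = max(abs(l - r) for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
print(max_err)  # small, limited only by the finite-difference error
```

Note that the order of the factors matters: $A'B + AB'$ is in general different from $BA' + B'A$.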
2.6.1 The Coriolis Acceleration

Imagine a point on the surface of the earth. Now consider unit vectors, one pointing South, one pointing East and one pointing directly away from the center of the earth.






Denote the first as i, the second as j, and the third as k. If you are standing on the earth 
you will consider these vectors as fixed, but of course they are not. As the earth turns, they 
change direction and so each is in reality a function of t. Nevertheless, it is with respect 
to these apparently fixed vectors that you wish to understand acceleration, velocities, and 
displacements. 

In general, let $\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*$ be the usual fixed vectors in space and let $\mathbf{i}(t), \mathbf{j}(t), \mathbf{k}(t)$ be an orthonormal basis of vectors for each $t$, like the vectors described in the first paragraph. It is assumed these vectors are $C^1$ functions of $t$. Letting the positive $x$ axis extend in the direction of $\mathbf{i}(t)$, the positive $y$ axis extend in the direction of $\mathbf{j}(t)$, and the positive $z$ axis extend in the direction of $\mathbf{k}(t)$, yields a moving coordinate system. Now let $\mathbf{u}$ be a vector and let $t_0$ be some reference time. For example you could let $t_0 = 0$. Then define the components of $\mathbf{u}$ with respect to these vectors, $\mathbf{i}, \mathbf{j}, \mathbf{k}$ at time $t_0$ as

$$\mathbf{u} \equiv u^1 \mathbf{i}(t_0) + u^2 \mathbf{j}(t_0) + u^3 \mathbf{k}(t_0).$$

Let $\mathbf{u}(t)$ be defined as the vector which has the same components with respect to $\mathbf{i}, \mathbf{j}, \mathbf{k}$ but at time $t$. Thus

$$\mathbf{u}(t) \equiv u^1 \mathbf{i}(t) + u^2 \mathbf{j}(t) + u^3 \mathbf{k}(t)$$

and the vector has changed although the components have not.

This is exactly the situation in the case of the apparently fixed basis vectors on the earth if $\mathbf{u}$ is a position vector from the given spot on the earth's surface to a point regarded as fixed with the earth due to its keeping the same coordinates relative to the coordinate axes which are fixed with the earth. Now define a linear transformation $Q(t)$ mapping $\mathbb{R}^3$ to $\mathbb{R}^3$ by

$$Q(t)\mathbf{u} \equiv u^1 \mathbf{i}(t) + u^2 \mathbf{j}(t) + u^3 \mathbf{k}(t)$$

where

$$\mathbf{u} \equiv u^1 \mathbf{i}(t_0) + u^2 \mathbf{j}(t_0) + u^3 \mathbf{k}(t_0).$$

Thus letting $\mathbf{v}$ be a vector defined in the same manner as $\mathbf{u}$ and $\alpha, \beta$ scalars,

$$Q(t)(\alpha\mathbf{u} + \beta\mathbf{v}) \equiv (\alpha u^1 + \beta v^1)\mathbf{i}(t) + (\alpha u^2 + \beta v^2)\mathbf{j}(t) + (\alpha u^3 + \beta v^3)\mathbf{k}(t)$$

$$= (\alpha u^1 \mathbf{i}(t) + \alpha u^2 \mathbf{j}(t) + \alpha u^3 \mathbf{k}(t)) + (\beta v^1 \mathbf{i}(t) + \beta v^2 \mathbf{j}(t) + \beta v^3 \mathbf{k}(t))$$

$$= \alpha (u^1 \mathbf{i}(t) + u^2 \mathbf{j}(t) + u^3 \mathbf{k}(t)) + \beta (v^1 \mathbf{i}(t) + v^2 \mathbf{j}(t) + v^3 \mathbf{k}(t))$$

$$\equiv \alpha Q(t)\mathbf{u} + \beta Q(t)\mathbf{v}$$

showing that $Q(t)$ is a linear transformation. Also, $Q(t)$ preserves all distances because, since the vectors $\mathbf{i}(t), \mathbf{j}(t), \mathbf{k}(t)$ form an orthonormal set,

$$|Q(t)\mathbf{u}| = \left( \sum_{i=1}^{3} (u^i)^2 \right)^{1/2} = |\mathbf{u}|.$$



Lemma 2.6.3 Suppose $Q(t)$ is a real, differentiable $n \times n$ matrix which preserves distances. Then $Q(t)Q(t)^T = Q(t)^T Q(t) = I$. Also, if $\mathbf{u}(t) \equiv Q(t)\mathbf{u}$, then there exists a vector $\boldsymbol{\Omega}(t)$ such that

$$\mathbf{u}'(t) = \boldsymbol{\Omega}(t) \times \mathbf{u}(t).$$

The symbol $\times$ refers to the cross product.

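Proof: Recall that $(\mathbf{z} \cdot \mathbf{w}) = \frac{1}{4}\left( |\mathbf{z} + \mathbf{w}|^2 - |\mathbf{z} - \mathbf{w}|^2 \right)$. Therefore,

$$(Q(t)\mathbf{u} \cdot Q(t)\mathbf{w}) = \frac{1}{4}\left( |Q(t)(\mathbf{u} + \mathbf{w})|^2 - |Q(t)(\mathbf{u} - \mathbf{w})|^2 \right)$$

$$= \frac{1}{4}\left( |\mathbf{u} + \mathbf{w}|^2 - |\mathbf{u} - \mathbf{w}|^2 \right)$$

$$= (\mathbf{u} \cdot \mathbf{w}).$$

This implies

$$\left( Q(t)^T Q(t)\mathbf{u} \cdot \mathbf{w} \right) = (\mathbf{u} \cdot \mathbf{w})$$

for all $\mathbf{u}, \mathbf{w}$. Therefore, $Q(t)^T Q(t)\mathbf{u} = \mathbf{u}$ and so $Q(t)^T Q(t) = Q(t)Q(t)^T = I$. This proves the first part of the lemma.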

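It follows from the product rule, Lemma 2.6.2, that

$$Q'(t)Q(t)^T + Q(t)Q'(t)^T = 0$$

and so

$$Q'(t)Q(t)^T = -\left( Q'(t)Q(t)^T \right)^T. \qquad (2.26)$$

From the definition, $Q(t)\mathbf{u} = \mathbf{u}(t)$, and since $Q(t)^T \mathbf{u}(t) = \mathbf{u}$,

$$\mathbf{u}'(t) = Q'(t)\mathbf{u} = Q'(t)Q(t)^T \mathbf{u}(t).$$

Then writing the matrix of $Q'(t)Q(t)^T$ with respect to fixed in space orthonormal basis vectors $\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*$, where these are the usual basis vectors for $\mathbb{R}^3$, it follows from 2.26 that the matrix of $Q'(t)Q(t)^T$ is of the form

$$\begin{pmatrix} 0 & -\omega_3(t) & \omega_2(t) \\ \omega_3(t) & 0 & -\omega_1(t) \\ -\omega_2(t) & \omega_1(t) & 0 \end{pmatrix}$$

for some time dependent scalars $\omega_i$. Therefore,

$$\begin{pmatrix} u^1 \\ u^2 \\ u^3 \end{pmatrix}'(t) = \begin{pmatrix} 0 & -\omega_3(t) & \omega_2(t) \\ \omega_3(t) & 0 & -\omega_1(t) \\ -\omega_2(t) & \omega_1(t) & 0 \end{pmatrix} \begin{pmatrix} u^1 \\ u^2 \\ u^3 \end{pmatrix}(t)$$

where the $u^i$ are the components of the vector $\mathbf{u}(t)$ in terms of the fixed vectors $\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*$. Therefore,

$$\mathbf{u}'(t) = \boldsymbol{\Omega}(t) \times \mathbf{u}(t) = Q'(t)Q(t)^T \mathbf{u}(t) \qquad (2.27)$$

where

$$\boldsymbol{\Omega}(t) = \omega_1(t)\mathbf{i}^* + \omega_2(t)\mathbf{j}^* + \omega_3(t)\mathbf{k}^*,$$

because

$$\boldsymbol{\Omega}(t) \times \mathbf{u}(t) \equiv \begin{vmatrix} \mathbf{i}^* & \mathbf{j}^* & \mathbf{k}^* \\ \omega_1 & \omega_2 & \omega_3 \\ u^1 & u^2 & u^3 \end{vmatrix} \equiv \mathbf{i}^*\left( \omega_2 u^3 - \omega_3 u^2 \right) + \mathbf{j}^*\left( \omega_3 u^1 - \omega_1 u^3 \right) + \mathbf{k}^*\left( \omega_1 u^2 - \omega_2 u^1 \right).$$

This proves the lemma and yields the existence part of the following theorem. ■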
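
The key algebraic step in this proof, that multiplying by a skew symmetric $3 \times 3$ matrix is the same as taking a cross product with the vector $(\omega_1, \omega_2, \omega_3)$, is easy to check numerically. The following Python sketch does this for one sample pair of vectors. The function names and the sample values are illustrative assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def skew(w):
    """Skew symmetric matrix whose action on u equals w x u."""
    w1, w2, w3 = w
    return [[0, -w3, w2],
            [w3, 0, -w1],
            [-w2, w1, 0]]

def matvec(M, v):
    return tuple(sum(M[i][k] * v[k] for k in range(3)) for i in range(3))

omega = (1, -2, 3)   # sample components of Omega
u = (4, 5, -6)       # sample components of u

print(matvec(skew(omega), u))  # (-3, 18, 13)
print(cross(omega, u))         # (-3, 18, 13)
```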

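Theorem 2.6.4 Let $\mathbf{i}(t), \mathbf{j}(t), \mathbf{k}(t)$ be as described. Then there exists a unique vector $\boldsymbol{\Omega}(t)$ such that if $\mathbf{u}(t)$ is a vector whose components are constant with respect to $\mathbf{i}(t), \mathbf{j}(t), \mathbf{k}(t)$, then

$$\mathbf{u}'(t) = \boldsymbol{\Omega}(t) \times \mathbf{u}(t).$$

Proof: It only remains to prove uniqueness. Suppose $\boldsymbol{\Omega}_1$ also works. Then $\mathbf{u}(t) = Q(t)\mathbf{u}$ and so $\mathbf{u}'(t) = Q'(t)\mathbf{u}$ and

$$Q'(t)\mathbf{u} = \boldsymbol{\Omega} \times Q(t)\mathbf{u} = \boldsymbol{\Omega}_1 \times Q(t)\mathbf{u}$$

for all $\mathbf{u}$. Therefore,

$$(\boldsymbol{\Omega} - \boldsymbol{\Omega}_1) \times Q(t)\mathbf{u} = \mathbf{0}$$

for all $\mathbf{u}$ and since $Q(t)$ is one to one and onto, this implies $(\boldsymbol{\Omega} - \boldsymbol{\Omega}_1) \times \mathbf{w} = \mathbf{0}$ for all $\mathbf{w}$ and thus $\boldsymbol{\Omega} - \boldsymbol{\Omega}_1 = \mathbf{0}$. ■

Now let $\mathbf{R}(t)$ be a position vector and let

$$\mathbf{r}(t) = \mathbf{R}(t) + \mathbf{r}_B(t)$$

where

$$\mathbf{r}_B(t) \equiv x(t)\mathbf{i}(t) + y(t)\mathbf{j}(t) + z(t)\mathbf{k}(t).$$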

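In the example of the earth, $\mathbf{R}(t)$ is the position vector of a point $p(t)$ on the earth's surface and $\mathbf{r}_B(t)$ is the position vector of another point from $p(t)$, thus regarding $p(t)$ as the origin. $\mathbf{r}_B(t)$ is the position vector of a point as perceived by the observer on the earth with respect to the vectors he thinks of as fixed. Similarly, $\mathbf{v}_B(t)$ and $\mathbf{a}_B(t)$ will be the velocity and acceleration relative to $\mathbf{i}(t), \mathbf{j}(t), \mathbf{k}(t)$, and so $\mathbf{v}_B = x'\mathbf{i} + y'\mathbf{j} + z'\mathbf{k}$ and $\mathbf{a}_B = x''\mathbf{i} + y''\mathbf{j} + z''\mathbf{k}$. Then

$$\mathbf{v} = \mathbf{r}' = \mathbf{R}' + x'\mathbf{i} + y'\mathbf{j} + z'\mathbf{k} + x\mathbf{i}' + y\mathbf{j}' + z\mathbf{k}'.$$

By (2.27), if $\mathbf{e} \in \{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$, $\mathbf{e}' = \boldsymbol{\Omega} \times \mathbf{e}$ because the components of these vectors with respect to $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are constant. Therefore,

$$x\mathbf{i}' + y\mathbf{j}' + z\mathbf{k}' = x\boldsymbol{\Omega} \times \mathbf{i} + y\boldsymbol{\Omega} \times \mathbf{j} + z\boldsymbol{\Omega} \times \mathbf{k} = \boldsymbol{\Omega} \times (x\mathbf{i} + y\mathbf{j} + z\mathbf{k})$$

and consequently,

$$\mathbf{v} = \mathbf{R}' + x'\mathbf{i} + y'\mathbf{j} + z'\mathbf{k} + \boldsymbol{\Omega} \times \mathbf{r}_B = \mathbf{R}' + x'\mathbf{i} + y'\mathbf{j} + z'\mathbf{k} + \boldsymbol{\Omega} \times (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}).$$

Now consider the acceleration. Quantities which are relative to the moving coordinate system and quantities which are relative to a fixed coordinate system are distinguished by using the subscript $B$ on those relative to the moving coordinate system.

$$\mathbf{a} = \mathbf{v}' = \mathbf{R}'' + x''\mathbf{i} + y''\mathbf{j} + z''\mathbf{k} + \overbrace{x'\mathbf{i}' + y'\mathbf{j}' + z'\mathbf{k}'}^{\boldsymbol{\Omega} \times \mathbf{v}_B} + \boldsymbol{\Omega}' \times \mathbf{r}_B + \boldsymbol{\Omega} \times \Big( \overbrace{x'\mathbf{i} + y'\mathbf{j} + z'\mathbf{k}}^{\mathbf{v}_B} + \overbrace{x\mathbf{i}' + y\mathbf{j}' + z\mathbf{k}'}^{\boldsymbol{\Omega} \times \mathbf{r}_B} \Big)$$

$$= \mathbf{R}'' + \mathbf{a}_B + \boldsymbol{\Omega}' \times \mathbf{r}_B + 2\boldsymbol{\Omega} \times \mathbf{v}_B + \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{r}_B).$$

The acceleration $\mathbf{a}_B$ is that perceived by an observer who is moving with the moving coordinate system and for whom the moving coordinate system is fixed. The term $\boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{r}_B)$ is called the centripetal acceleration. Solving for $\mathbf{a}_B$,

$$\mathbf{a}_B = \mathbf{a} - \mathbf{R}'' - \boldsymbol{\Omega}' \times \mathbf{r}_B - 2\boldsymbol{\Omega} \times \mathbf{v}_B - \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{r}_B). \qquad (2.28)$$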

Here the term $-(\boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{r}_B))$ is called the centrifugal acceleration, it being an acceleration felt by the observer relative to the moving coordinate system which he regards as fixed, and the term $-2\boldsymbol{\Omega} \times \mathbf{v}_B$ is called the Coriolis acceleration, an acceleration experienced by the observer as he moves relative to the moving coordinate system. The mass multiplied by the Coriolis acceleration defines the Coriolis force.

There is a ride found in some amusement parks in which the victims stand next to 
a circular wall covered with a carpet or some rough material. Then the whole circular 
room begins to revolve faster and faster. At some point, the bottom drops out and the 
victims are held in place by friction. The force they feel is called centrifugal force and it 
causes centrifugal acceleration. It is not necessary to move relative to coordinates fixed with 
the revolving wall in order to feel this force and it is pretty predictable. However, if the 
nauseated victim moves relative to the rotating wall, he will feel the effects of the Coriolis 
force and this force is really strange. The difference between these forces is that the Coriolis 
force is caused by movement relative to the moving coordinate system and the centrifugal 
force is not. 
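
To make the two terms concrete, here is a small Python sketch that evaluates the centrifugal term $-\boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{r}_B)$ and the Coriolis term $-2\boldsymbol{\Omega} \times \mathbf{v}_B$ from (2.28) for a rotating room. The numbers are made up for illustration: the room turns about the vertical axis at 1 radian per second, the rider stands 2 units from the axis and walks outward at 1 unit per second.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(c, v):
    return tuple(c * x for x in v)

Omega = (0, 0, 1)   # assumed angular velocity, about the vertical axis
r_B = (2, 0, 0)     # assumed position relative to the rotating frame
v_B = (1, 0, 0)     # assumed velocity relative to the rotating frame

centrifugal = scale(-1, cross(Omega, cross(Omega, r_B)))
coriolis = scale(-2, cross(Omega, v_B))

print(centrifugal)  # (2, 0, 0): directed away from the axis
print(coriolis)     # (0, -2, 0): perpendicular to the relative velocity
```

Setting `v_B` to zero makes the Coriolis term vanish while the centrifugal term is unchanged, which is the distinction drawn in the paragraph above.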

2.6.2 The Coriolis Acceleration On The Rotating Earth 

Now consider the earth. Let $\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*$ be the usual basis vectors fixed in space with $\mathbf{k}^*$ pointing in the direction of the north pole from the center of the earth and let $\mathbf{i}, \mathbf{j}, \mathbf{k}$ be the unit vectors described earlier with $\mathbf{i}$ pointing South, $\mathbf{j}$ pointing East, and $\mathbf{k}$ pointing away from the center of the earth at some point of the rotating earth's surface $p$. Letting $\mathbf{R}(t)$ be the position vector of the point $p$ from the center of the earth, observe the coordinates of $\mathbf{R}(t)$ are constant with respect to $\mathbf{i}(t), \mathbf{j}(t), \mathbf{k}(t)$. Also, the earth rotates from West to East and the speed of a point on the surface of the earth relative to an observer fixed in space is $\omega|\mathbf{R}|\sin\phi$, where $\omega$ is the angular speed of the earth about an axis through the poles and $\phi$ is the polar angle measured from the positive $z$ axis down as in spherical coordinates. It follows from the geometric definition of the cross product that

$$\mathbf{R}' = \omega\mathbf{k}^* \times \mathbf{R}.$$

Therefore, the vector of Theorem 2.6.4 is $\boldsymbol{\Omega} = \omega\mathbf{k}^*$ and so

$$\mathbf{R}'' = \overbrace{\boldsymbol{\Omega}' \times \mathbf{R}}^{=\mathbf{0}} + \boldsymbol{\Omega} \times \mathbf{R}' = \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R})$$

since $\boldsymbol{\Omega}$ does not depend on $t$. Formula 2.28 implies

$$\mathbf{a}_B = \mathbf{a} - \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R}) - 2\boldsymbol{\Omega} \times \mathbf{v}_B - \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{r}_B). \qquad (2.29)$$

In this formula, you can totally ignore the term $\boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{r}_B)$ because it is so small whenever you are considering motion near some point on the earth's surface. To see this, note that $(24)(3600)$ is the number of seconds in a day, so $\omega(24)(3600) = 2\pi$, and so $\omega = 7.2722 \times 10^{-5}$ in radians per second. If you are using seconds to measure time and feet to measure distance, this term is therefore no larger than

$$\left( 7.2722 \times 10^{-5} \right)^2 |\mathbf{r}_B|.$$

Clearly this is not worth considering in the presence of the acceleration due to gravity which is approximately 32 feet per second squared near the surface of the earth.
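
The size estimate above is simple arithmetic and can be reproduced with a few lines of Python. The distance of 1000 feet used for $|\mathbf{r}_B|$ is an assumed sample value chosen only to give a sense of scale.

```python
import math

seconds_per_day = 24 * 3600
omega = 2 * math.pi / seconds_per_day    # angular speed in radians per second
print(omega)                             # about 7.2722e-05

r_B = 1000.0                             # assumed distance in feet
bound = omega ** 2 * r_B                 # upper bound for |Omega x (Omega x r_B)|
gravity = 32.0                           # feet per second squared
print(bound, bound / gravity)            # tiny compared with gravity
```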
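If the acceleration $\mathbf{a}$ is due to gravity, then

$$\mathbf{a}_B = \mathbf{a} - \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R}) - 2\boldsymbol{\Omega} \times \mathbf{v}_B = \overbrace{-\frac{GM(\mathbf{R} + \mathbf{r}_B)}{|\mathbf{R} + \mathbf{r}_B|^3} - \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R})}^{\equiv\, \mathbf{g}} - 2\boldsymbol{\Omega} \times \mathbf{v}_B \equiv \mathbf{g} - 2\boldsymbol{\Omega} \times \mathbf{v}_B.$$

Note that

$$\boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R}) = (\boldsymbol{\Omega} \cdot \mathbf{R})\boldsymbol{\Omega} - |\boldsymbol{\Omega}|^2 \mathbf{R}$$

and so $\mathbf{g}$, the acceleration relative to the moving coordinate system on the earth, is not directed exactly toward the center of the earth except at the poles and at the equator, although the components of acceleration which are in other directions are very small when compared with the acceleration due to the force of gravity and are often neglected. Therefore, if the only force acting on an object is due to gravity, the following formula describes the acceleration relative to a coordinate system moving with the earth's surface.

$$\mathbf{a}_B = \mathbf{g} - 2(\boldsymbol{\Omega} \times \mathbf{v}_B)$$

While the vector $\boldsymbol{\Omega}$ is quite small, if the relative velocity $\mathbf{v}_B$ is large, the Coriolis acceleration could be significant. This is described in terms of the vectors $\mathbf{i}(t), \mathbf{j}(t), \mathbf{k}(t)$ next.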

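Letting $(\rho, \theta, \phi)$ be the usual spherical coordinates of the point $p(t)$ on the surface taken with respect to $\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*$ the usual way with $\phi$ the polar angle, it follows the $\mathbf{i}^*, \mathbf{j}^*, \mathbf{k}^*$ coordinates of this point are

$$\begin{pmatrix} \rho\sin(\phi)\cos(\theta) \\ \rho\sin(\phi)\sin(\theta) \\ \rho\cos(\phi) \end{pmatrix}.$$

It follows,

$$\mathbf{i} = \cos(\phi)\cos(\theta)\,\mathbf{i}^* + \cos(\phi)\sin(\theta)\,\mathbf{j}^* - \sin(\phi)\,\mathbf{k}^*$$

$$\mathbf{j} = -\sin(\theta)\,\mathbf{i}^* + \cos(\theta)\,\mathbf{j}^* + 0\,\mathbf{k}^*$$

and

$$\mathbf{k} = \sin(\phi)\cos(\theta)\,\mathbf{i}^* + \sin(\phi)\sin(\theta)\,\mathbf{j}^* + \cos(\phi)\,\mathbf{k}^*.$$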
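It is necessary to obtain $\mathbf{k}^*$ in terms of the vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$. Thus the following equation needs to be solved for $a, b, c$ to find $\mathbf{k}^* = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$:

$$\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \cos(\phi)\cos(\theta) & -\sin(\theta) & \sin(\phi)\cos(\theta) \\ \cos(\phi)\sin(\theta) & \cos(\theta) & \sin(\phi)\sin(\theta) \\ -\sin(\phi) & 0 & \cos(\phi) \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix}.$$

The first column is $\mathbf{i}$, the second is $\mathbf{j}$ and the third is $\mathbf{k}$ in the above matrix. The solution is $a = -\sin(\phi)$, $b = 0$, and $c = \cos(\phi)$.

Now the Coriolis acceleration on the earth equals

$$2(\boldsymbol{\Omega} \times \mathbf{v}_B) = 2\omega \Big( \overbrace{-\sin(\phi)\,\mathbf{i} + 0\,\mathbf{j} + \cos(\phi)\,\mathbf{k}}^{\mathbf{k}^*} \Big) \times \left( x'\mathbf{i} + y'\mathbf{j} + z'\mathbf{k} \right).$$

This equals

$$2\omega\left[ (-y'\cos\phi)\,\mathbf{i} + (x'\cos\phi + z'\sin\phi)\,\mathbf{j} - (y'\sin\phi)\,\mathbf{k} \right]. \qquad (2.31)$$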
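
Both the solution $a = -\sin\phi$, $b = 0$, $c = \cos\phi$ and formula (2.31) can be spot checked numerically. The Python sketch below does this for one arbitrary choice of angles and relative velocity components. All the sample values are assumptions made for the check.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

phi, theta = 0.9, 0.4                      # sample angles in radians
sp, cp = math.sin(phi), math.cos(phi)
st, ct = math.sin(theta), math.cos(theta)

# i, j, k written in terms of the fixed vectors i*, j*, k*.
i = (cp * ct, cp * st, -sp)
j = (-st, ct, 0.0)
k = (sp * ct, sp * st, cp)

# Claimed solution: k* = a i + b j + c k.
a, b, c = -sp, 0.0, cp
k_star = tuple(a * i[n] + b * j[n] + c * k[n] for n in range(3))
print(k_star)                              # approximately (0, 0, 1)

# Formula (2.31), computed in i, j, k components.
omega = 7.2722e-5
vx, vy, vz = 3.0, -2.0, 5.0                # sample values of x', y', z'
lhs = tuple(2 * omega * comp for comp in cross((a, b, c), (vx, vy, vz)))
rhs = (2 * omega * (-vy * cp),
       2 * omega * (vx * cp + vz * sp),
       2 * omega * (-vy * sp))
print(lhs, rhs)
```

Taking the cross product componentwise in the $\mathbf{i}, \mathbf{j}, \mathbf{k}$ coordinates is legitimate here because these vectors form a right handed orthonormal system.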



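Remember $\phi$ is fixed and pertains to the fixed point $p(t)$ on the earth's surface. Therefore, if the acceleration $\mathbf{a}$ is due to gravity,

$$\mathbf{a}_B = \mathbf{g} - 2\omega\left[ (-y'\cos\phi)\,\mathbf{i} + (x'\cos\phi + z'\sin\phi)\,\mathbf{j} - (y'\sin\phi)\,\mathbf{k} \right]$$

where $\mathbf{g} = -\dfrac{GM(\mathbf{R} + \mathbf{r}_B)}{|\mathbf{R} + \mathbf{r}_B|^3} - \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R})$ as explained above. The term $\boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R})$ is pretty small and so it will be neglected. However, the Coriolis force will not be neglected.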

Example 2.6.5 Suppose a rock is dropped from a tall building. Where will it strike? 



Assume $\mathbf{a} = -g\mathbf{k}$ and the $\mathbf{j}$ component of $\mathbf{a}_B$ is approximately

$$-2\omega\left( x'\cos\phi + z'\sin\phi \right).$$

The dominant term in this expression is clearly the second one because $x'$ will be small. Also, the $\mathbf{i}$ and $\mathbf{k}$ contributions will be very small. Therefore, the following equation is descriptive of the situation.

$$\mathbf{a}_B = -g\mathbf{k} - 2z'\omega\sin\phi\,\mathbf{j}.$$

$z' = -gt$ approximately. Therefore, considering the $\mathbf{j}$ component, this is

$$2gt\omega\sin\phi.$$

Two integrations give $\left( \omega g t^3 / 3 \right)\sin\phi$ for the $\mathbf{j}$ component of the relative displacement at time $t$.

This shows the rock does not fall directly towards the center of the earth as expected but slightly to the east.
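
To get a feeling for the size of this eastward drift, the formula $(\omega g t^3/3)\sin\phi$ can be evaluated for a sample drop. In the Python sketch below the building height of 400 feet and the location on the equator ($\phi = \pi/2$) are assumed values, the fall time comes from $h = gt^2/2$ with air resistance ignored, and the function name is hypothetical.

```python
import math

def eastward_deflection(height_ft, polar_angle, g=32.0):
    """Eastward displacement in feet, using (omega g t^3 / 3) sin(phi)."""
    omega = 2 * math.pi / (24 * 3600)      # radians per second
    t = math.sqrt(2 * height_ft / g)       # fall time from h = g t^2 / 2
    return omega * g * t ** 3 * math.sin(polar_angle) / 3

deflection = eastward_deflection(400.0, math.pi / 2)
print(deflection)        # roughly 0.097 feet
print(12 * deflection)   # a little over one inch
```

At the poles, where $\sin\phi = 0$, the same formula gives no deflection at all.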
Example 2.6.6 In 1851 Foucault set a pendulum vibrating and observed the earth rotate out from under it. It was a very long pendulum with a heavy weight at the end so that it would vibrate for a long time without stopping$^2$. This is what allowed him to observe the
earth rotate out from under it. Clearly such a pendulum will take 24 hours for the plane of 
vibration to appear to make one complete revolution at the north pole. It is also reasonable 
to expect that no such observed rotation would take place on the equator. Is it possible to 
predict what will take place at various latitudes? 

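Using 2.31, in 2.29,

$$\mathbf{a}_B = \mathbf{a} - \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R}) - 2\omega\left[ (-y'\cos\phi)\,\mathbf{i} + (x'\cos\phi + z'\sin\phi)\,\mathbf{j} - (y'\sin\phi)\,\mathbf{k} \right].$$

Neglecting the small term $\boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{R})$, this becomes

$$= -g\mathbf{k} + \mathbf{T}/m - 2\omega\left[ (-y'\cos\phi)\,\mathbf{i} + (x'\cos\phi + z'\sin\phi)\,\mathbf{j} - (y'\sin\phi)\,\mathbf{k} \right]$$

where $\mathbf{T}$, the tension in the string of the pendulum, is directed towards the point at which the pendulum is supported, and $m$ is the mass of the pendulum bob. The pendulum can be thought of as the position vector from $(0, 0, l)$ to the surface of the sphere $x^2 + y^2 + (z - l)^2 = l^2$. Therefore,

$$\mathbf{T} = -T\frac{x}{l}\,\mathbf{i} - T\frac{y}{l}\,\mathbf{j} + T\frac{l - z}{l}\,\mathbf{k}$$

and consequently, the differential equations of relative motion are

$$x'' = -T\frac{x}{ml} + 2\omega y'\cos\phi$$

$$y'' = -T\frac{y}{ml} - 2\omega\left( x'\cos\phi + z'\sin\phi \right)$$

and

$$z'' = T\frac{l - z}{ml} - g + 2\omega y'\sin\phi.$$

If the vibrations of the pendulum are small so that for practical purposes, $z'' = z = 0$, the last equation may be solved for $T$ to get

$$gm - 2\omega y'\sin(\phi)\,m = T.$$

Therefore, the first two equations become

$$x'' = -\left( gm - 2\omega m y'\sin\phi \right)\frac{x}{ml} + 2\omega y'\cos\phi$$

and

$$y'' = -\left( gm - 2\omega m y'\sin\phi \right)\frac{y}{ml} - 2\omega\left( x'\cos\phi + z'\sin\phi \right).$$

All terms of the form $xy'$ or $y'y$ can be neglected because it is assumed $x$ and $y$ remain small. Also, the pendulum is assumed to be long with a heavy weight so that $x'$ and $y'$ are also small. With these simplifying assumptions, the equations of motion become

$$x'' + g\frac{x}{l} = 2\omega y'\cos\phi$$

and

$$y'' + g\frac{y}{l} = -2\omega x'\cos\phi.$$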

$^2$ There is such a pendulum in the Eyring building at BYU and to keep people from touching it, there is a little sign which says Warning! 1000 ohms.

These equations are of the form

$$x'' + a^2 x = by', \quad y'' + a^2 y = -bx' \qquad (2.32)$$

where $a^2 = \frac{g}{l}$ and $b = 2\omega\cos\phi$. Then it is fairly tedious but routine to verify that for each constant $c$,

$$x = c\sin\left( \frac{bt}{2} \right)\sin\left( \frac{\sqrt{b^2 + 4a^2}}{2}\,t \right), \quad y = c\cos\left( \frac{bt}{2} \right)\sin\left( \frac{\sqrt{b^2 + 4a^2}}{2}\,t \right) \qquad (2.33)$$

yields a solution to 2.32 along with the initial conditions,

$$x(0) = 0, \quad y(0) = 0, \quad x'(0) = 0, \quad y'(0) = \frac{c\sqrt{b^2 + 4a^2}}{2}. \qquad (2.34)$$
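
The "tedious but routine" verification can at least be spot checked by machine. The following Python sketch substitutes the functions of 2.33 into 2.32, approximating the derivatives by central differences, and reports the largest residual at a few sample times. The constants `a`, `b`, `c` are arbitrary sample values, not physically realistic ones, and the helper names are assumptions.

```python
import math

a, b, c = 1.0, 0.1, 2.0                   # hypothetical sample constants
k = math.sqrt(b * b + 4 * a * a) / 2

def x(t): return c * math.sin(b * t / 2) * math.sin(k * t)
def y(t): return c * math.cos(b * t / 2) * math.sin(k * t)

def d1(f, t, h=1e-4):
    """Central-difference first derivative."""
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-4):
    """Central-difference second derivative."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

residuals = []
for t in (0.3, 1.7, 4.2):
    residuals.append(abs(d2(x, t) + a * a * x(t) - b * d1(y, t)))
    residuals.append(abs(d2(y, t) + a * a * y(t) + b * d1(x, t)))
max_residual = max(residuals)
print(max_residual)                        # near zero, up to discretization error

# Initial conditions of 2.34.
print(x(0.0), y(0.0), d1(x, 0.0), d1(y, 0.0), c * k)
```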

It is clear from experiments with the pendulum that the earth does indeed rotate out from 
under it causing the plane of vibration of the pendulum to appear to rotate. The purpose 
of this discussion is not to establish these self evident facts but to predict how long it takes 
for the plane of vibration to make one revolution. Therefore, there will be some instant in 
time at which the pendulum will be vibrating in a plane determined by k and j. (Recall 
k points away from the center of the earth and j points East. ) At this instant in time, 
defined as t = 0, the conditions of 2.34 will hold for some value of c and so the solution to 
2.32 having these initial conditions will be those of 2.33 by uniqueness of the initial value 
problem. Writing these solutions differently, 

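$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = c\begin{pmatrix} \sin\left( \frac{bt}{2} \right) \\ \cos\left( \frac{bt}{2} \right) \end{pmatrix}\sin\left( \frac{\sqrt{b^2 + 4a^2}}{2}\,t \right).$$

This is very interesting! The vector $c\begin{pmatrix} \sin\left( \frac{bt}{2} \right) \\ \cos\left( \frac{bt}{2} \right) \end{pmatrix}$ always has magnitude equal to $|c|$ but its direction changes very slowly because $b$ is very small. The plane of vibration is determined by this vector and the vector $\mathbf{k}$. The term $\sin\left( \frac{\sqrt{b^2 + 4a^2}}{2}\,t \right)$ changes relatively fast and takes values between $-1$ and $1$. This is what describes the actual observed vibrations of the pendulum. Thus the plane of vibration will have made one complete revolution when $t = T$ for

$$\frac{bT}{2} \equiv 2\pi.$$

Therefore, the time it takes for the earth to turn out from under the pendulum is

$$T = \frac{4\pi}{2\omega\cos\phi} = \frac{2\pi}{\omega}\sec\phi.$$

Since $\omega$ is the angular speed of the rotating earth, it follows $\omega = \frac{2\pi}{24} = \frac{\pi}{12}$ in radians per hour. Therefore, the above formula implies

$$T = 24\sec\phi.$$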

I think this is really amazing. You could actually determine latitude, not by taking readings 
with instruments using the North Star but by doing an experiment with a big pendulum. 
You would set it vibrating, observe $T$ in hours, and then solve the above equation for $\phi$. Also note the pendulum would not appear to change its plane of vibration at the equator because $\lim_{\phi \to \pi/2}\sec\phi = \infty$.
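
The formula $T = 24\sec\phi$ is easy to tabulate. In the Python sketch below, $\phi$ is the polar angle used throughout this section, so a geographic latitude of $L$ degrees north corresponds to $\phi = 90 - L$ degrees. The function name and the sample locations are illustrative assumptions.

```python
import math

def foucault_period_hours(polar_angle_deg):
    """Hours for one apparent revolution of the plane of vibration."""
    return 24.0 / math.cos(math.radians(polar_angle_deg))

def polar_angle_from_period(T_hours):
    """Invert T = 24 sec(phi) to recover the polar angle in degrees."""
    return math.degrees(math.acos(24.0 / T_hours))

print(foucault_period_hours(0.0))     # north pole: 24 hours
print(foucault_period_hours(60.0))    # latitude 30 degrees north: about 48 hours
print(polar_angle_from_period(48.0))  # about 60 degrees
```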

The Coriolis acceleration is also responsible for the phenomenon of the next example. 
Example 2.6.7 It is known that low pressure areas rotate counterclockwise as seen from above in the Northern hemisphere but clockwise in the Southern hemisphere. Why?

Neglect accelerations other than the Coriolis acceleration and the following acceleration 
which comes from an assumption that the point p (t) is the location of the lowest pressure. 

$$\mathbf{a} = -a(r_B)\,\mathbf{r}_B$$

where $r_B = r$ will denote the distance from the fixed point $p(t)$ on the earth's surface which is also the lowest pressure point. Of course the situation could be more complicated but this will suffice to explain the above question. Then the acceleration observed by a person on the earth relative to the apparently fixed vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ is

$$\mathbf{a}_B = -a(r_B)\left( x\mathbf{i} + y\mathbf{j} + z\mathbf{k} \right) - 2\omega\left[ -y'\cos(\phi)\,\mathbf{i} + \left( x'\cos(\phi) + z'\sin(\phi) \right)\mathbf{j} - \left( y'\sin(\phi) \right)\mathbf{k} \right].$$

Therefore, one obtains some differential equations from $\mathbf{a}_B = x''\mathbf{i} + y''\mathbf{j} + z''\mathbf{k}$ by matching the components. These are

$$x'' + a(r_B)\,x = 2\omega y'\cos\phi$$

$$y'' + a(r_B)\,y = -2\omega x'\cos\phi - 2\omega z'\sin(\phi)$$

$$z'' + a(r_B)\,z = 2\omega y'\sin\phi.$$

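Now remember, the vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are fixed relative to the earth and so are constant vectors. Therefore, from the properties of the determinant and the above differential equations,

$$\left( \mathbf{r}'_B \times \mathbf{r}_B \right)' = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x' & y' & z' \\ x & y & z \end{vmatrix}' = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x'' & y'' & z'' \\ x & y & z \end{vmatrix}$$

$$= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -a(r_B)x + 2\omega y'\cos\phi & -a(r_B)y - 2\omega x'\cos\phi - 2\omega z'\sin(\phi) & -a(r_B)z + 2\omega y'\sin\phi \\ x & y & z \end{vmatrix}.$$

Then the $\mathbf{k}^{th}$ component of this cross product equals

$$\omega\cos(\phi)\left( y^2 + x^2 \right)' + 2\omega x z'\sin(\phi).$$

The first term will be negative because it is assumed $p(t)$ is the location of low pressure causing $y^2 + x^2$ to be a decreasing function. If it is assumed there is not a substantial motion in the $\mathbf{k}$ direction, so that $z$ is fairly constant and the last term can be neglected, then the $\mathbf{k}^{th}$ component of $\left( \mathbf{r}'_B \times \mathbf{r}_B \right)'$ is negative provided $\phi \in \left( 0, \frac{\pi}{2} \right)$ and positive if $\phi \in \left( \frac{\pi}{2}, \pi \right)$. Beginning with a point at rest, this implies $\mathbf{r}'_B \times \mathbf{r}_B = \mathbf{0}$ initially and then the above implies its $\mathbf{k}^{th}$ component is negative in the upper hemisphere when $\phi < \pi/2$ and positive in the lower hemisphere when $\phi > \pi/2$. Using the right hand and the geometric definition of the cross product, this shows clockwise rotation in the lower hemisphere and counter clockwise rotation in the upper hemisphere.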

Note also that as $\phi$ gets close to $\pi/2$ near the equator, the above reasoning tends to break down because $\cos(\phi)$ becomes close to zero. Therefore, the motion towards the low pressure has to be more pronounced in comparison with the motion in the $\mathbf{k}$ direction in order to draw this conclusion.



2.7 Exercises 

1. Show the map $T : \mathbb{R}^n \to \mathbb{R}^m$ defined by $T(\mathbf{x}) = A\mathbf{x}$ where $A$ is an $m \times n$ matrix and $\mathbf{x}$ is an $n \times 1$ column vector is a linear transformation.

2. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/3$.

3. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/4$.

4. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $-\pi/3$.

5. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $2\pi/3$.

6. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/12$. Hint: Note that $\pi/12 = \pi/3 - \pi/4$.

7. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $2\pi/3$ and then reflects across the $x$ axis.

8. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/3$ and then reflects across the $x$ axis.

9. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/4$ and then reflects across the $x$ axis.

10. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/6$ and then reflects across the $x$ axis followed by a reflection across the $y$ axis.


11. Find the matrix for the linear transformation which reflects every vector in K 2 across 
the x axis and then rotates every vector through an angle of 7r/4. 

12. Find the matrix for the linear transformation which rotates every vector in R 2 through 
an angle of 7r/4 and next reflects every vector across the x axis. Compare with the 
above problem. 

Find the matrix for the linear transformation which reflects every vector in K 2 across 
the x axis and then rotates every vector through an angle of 7r/6. 

Find the matrix for the linear transformation which reflects every vector in K 2 across 
the y axis and then rotates every vector through an angle of n/Q. 



15. Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through
an angle of $5\pi/12$. Hint: Note that $5\pi/12 = 2\pi/3 - \pi/4$.

16. Find the matrix for $\operatorname{proj}_{\mathbf{u}}(\mathbf{v})$ where $\mathbf{u} = (1, -2, 3)$.

17. Find the matrix for $\operatorname{proj}_{\mathbf{u}}(\mathbf{v})$ where $\mathbf{u} = (1, 5, 3)$.

18. Find the matrix for $\operatorname{proj}_{\mathbf{u}}(\mathbf{v})$ where $\mathbf{u} = (1, 0, 3)$.

19. Give an example of a $2 \times 2$ matrix $A$ which has all its entries nonzero and satisfies
$A^2 = A$. A matrix which satisfies $A^2 = A$ is called idempotent.

20. Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times m$ matrix where $n < m$. Show that
$AB$ cannot have an inverse.

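21. Find $\ker(A)$ for

$$A = \begin{pmatrix} 1 & 2 & 3 & 2 & 1 \\ 0 & 2 & 1 & 1 & 2 \\ 1 & 4 & 4 & 3 & 3 \\ 0 & 2 & 1 & 1 & 2 \end{pmatrix}.$$

Recall $\ker(A)$ is just the set of solutions to $A\mathbf{x} = \mathbf{0}$.

22. If $A$ is a linear transformation, and $A\mathbf{x}_p = \mathbf{b}$, show that the general solution to the
equation $A\mathbf{x} = \mathbf{b}$ is of the form $\mathbf{x}_p + \mathbf{y}$ where $\mathbf{y} \in \ker(A)$. By this I mean to show that
whenever $A\mathbf{z} = \mathbf{b}$ there exists $\mathbf{y} \in \ker(A)$ such that $\mathbf{x}_p + \mathbf{y} = \mathbf{z}$. For the definition of
$\ker(A)$ see Problem 21.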






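23. Using Problem 21, find the general solution to the following linear system.

$$\begin{pmatrix} 1 & 2 & 3 & 2 & 1 \\ 0 & 2 & 1 & 1 & 2 \\ 1 & 4 & 4 & 3 & 3 \\ 0 & 2 & 1 & 1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_5 \end{pmatrix} = \ldots$$

24. Using Problem 21, find the general solution to the following linear system.

$$\begin{pmatrix} 1 & 2 & 3 & 2 & 1 \\ 0 & 2 & 1 & 1 & 2 \\ 1 & 4 & 4 & 3 & 3 \\ 0 & 2 & 1 & 1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_5 \end{pmatrix} = \ldots$$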

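25. Show that the function $T_{\mathbf{u}}$ defined by $T_{\mathbf{u}}(\mathbf{v}) = \mathbf{v} - \operatorname{proj}_{\mathbf{u}}(\mathbf{v})$ is also a linear
transformation.

26. If $\mathbf{u} = (1, 2, 3)$, as in Example 9.3.22 and $T_{\mathbf{u}}$ is given in the above problem, find the
matrix $A_{\mathbf{u}}$ which satisfies $A_{\mathbf{u}}\mathbf{x} = T_{\mathbf{u}}(\mathbf{x})$.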

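27. Suppose $V$ is a subspace of $\mathbb{F}^n$ and $T : V \to \mathbb{F}^p$ is a linear transformation.
Show that there exists a basis for $\operatorname{Im}(T) = T(V)$,

$$\{T\mathbf{v}_1, \cdots, T\mathbf{v}_m\},$$

and that in this situation,

$$\{\mathbf{v}_1, \cdots, \mathbf{v}_m\}$$

is linearly independent.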



28. ↑In the situation of Problem 27 where $V$ is a subspace of $\mathbb{F}^n$, show that there exists
$\{\mathbf{z}_1, \cdots, \mathbf{z}_r\}$ a basis for $\ker(T)$. (Recall Theorem 2.4.12. Since $\ker(T)$ is a subspace,
it has a basis.) Now for an arbitrary $T\mathbf{v} \in T(V)$, explain why

$$T\mathbf{v} = a_1 T\mathbf{v}_1 + \cdots + a_m T\mathbf{v}_m$$

and why this implies

$$\mathbf{v} - (a_1\mathbf{v}_1 + \cdots + a_m\mathbf{v}_m) \in \ker(T).$$

Then explain why $V = \operatorname{span}(\mathbf{v}_1, \cdots, \mathbf{v}_m, \mathbf{z}_1, \cdots, \mathbf{z}_r)$.

29. ↑In the situation of the above problem, show $\{\mathbf{v}_1, \cdots, \mathbf{v}_m, \mathbf{z}_1, \cdots, \mathbf{z}_r\}$ is a basis for $V$
and therefore, $\dim(V) = \dim(\ker(T)) + \dim(T(V))$.

30. ↑Let $A$ be a linear transformation from $V$ to $W$ and let $B$ be a linear transformation
from $W$ to $U$ where $V, W, U$ are all subspaces of some $\mathbb{F}^p$. Explain why

$$A(\ker(BA)) \subseteq \ker(B), \quad \ker(A) \subseteq \ker(BA).$$

31. ↑Let $\{\mathbf{x}_1, \cdots, \mathbf{x}_n\}$ be a basis of $\ker(A)$ and let $\{A\mathbf{y}_1, \cdots, A\mathbf{y}_m\}$ be a basis of $A(\ker(BA))$.
Let $\mathbf{z} \in \ker(BA)$. Explain why

$$A\mathbf{z} \in \operatorname{span}\{A\mathbf{y}_1, \cdots, A\mathbf{y}_m\}$$

and why there exist scalars $a_i$ such that

$$A(\mathbf{z} - (a_1\mathbf{y}_1 + \cdots + a_m\mathbf{y}_m)) = \mathbf{0}$$

and why it follows $\mathbf{z} - (a_1\mathbf{y}_1 + \cdots + a_m\mathbf{y}_m) \in \operatorname{span}\{\mathbf{x}_1, \cdots, \mathbf{x}_n\}$. Now explain why

$$\ker(BA) \subseteq \operatorname{span}\{\mathbf{x}_1, \cdots, \mathbf{x}_n, \mathbf{y}_1, \cdots, \mathbf{y}_m\}$$

and so

$$\dim(\ker(BA)) \le \dim(\ker(B)) + \dim(\ker(A)).$$

This important inequality is due to Sylvester. Show that equality holds if and only if
$A(\ker BA) = \ker(B)$.

32. Generalize the result of the previous problem to any finite product of linear mappings.

33. If $W \subseteq V$ for $W, V$ two subspaces of $\mathbb{F}^n$ and if $\dim(W) = \dim(V)$, show $W = V$.

34. Let $V$ be a subspace of $\mathbb{F}^n$ and let $V_1, \cdots, V_m$ be subspaces, each contained in $V$. Then

$$V = V_1 \oplus \cdots \oplus V_m \qquad (2.35)$$

if every $v \in V$ can be written in a unique way in the form

$$v = v_1 + \cdots + v_m$$

where each $v_i \in V_i$. This is called a direct sum. If this uniqueness condition does not
hold, then one writes

$$V = V_1 + \cdots + V_m$$

and this symbol means all vectors of the form

$$v_1 + \cdots + v_m, \quad v_j \in V_j \text{ for each } j.$$

Show 2.35 is equivalent to saying that if

$$0 = v_1 + \cdots + v_m, \quad v_j \in V_j \text{ for each } j,$$

then each $v_j = 0$. Next show that in the situation of 2.35, if $\beta_i = \{u^i_1, \cdots, u^i_{m_i}\}$ is a
basis for $V_i$, then $\{\beta_1, \cdots, \beta_m\}$ is a basis for $V$.

35. ↑Suppose you have finitely many linear mappings $L_1, L_2, \cdots, L_m$ which map $V$ to $V$
where $V$ is a subspace of $\mathbb{F}^n$ and suppose they commute. That is, $L_i L_j = L_j L_i$ for all
$i, j$. Also suppose $L_k$ is one to one on $\ker(L_j)$ whenever $j \ne k$. Letting $P$ denote the
product of these linear transformations, $P = L_1 L_2 \cdots L_m$, first show

$$\ker(L_1) + \cdots + \ker(L_m) \subseteq \ker(P).$$

Next show $L_j : \ker(L_i) \to \ker(L_i)$. Then show

$$\ker(L_1) + \cdots + \ker(L_m) = \ker(L_1) \oplus \cdots \oplus \ker(L_m).$$

Using Sylvester's theorem, and the result of Problem 33, show

$$\ker(P) = \ker(L_1) \oplus \cdots \oplus \ker(L_m).$$

Hint: By Sylvester's theorem and the above problem,

$$\dim(\ker(P)) \le \sum_i \dim(\ker(L_i)) = \dim(\ker(L_1) \oplus \cdots \oplus \ker(L_m)) \le \dim(\ker(P)).$$

Now consider Problem 33.

36. Let $\mathcal{M}(\mathbb{F}^n, \mathbb{F}^n)$ denote the set of all $n \times n$ matrices having entries in $\mathbb{F}$. With the usual
operations of matrix addition and scalar multiplications, explain why $\mathcal{M}(\mathbb{F}^n, \mathbb{F}^n)$ can
be considered as $\mathbb{F}^{n^2}$. Give a basis for $\mathcal{M}(\mathbb{F}^n, \mathbb{F}^n)$. If $A \in \mathcal{M}(\mathbb{F}^n, \mathbb{F}^n)$, explain why
there exists a monic (leading coefficient equals 1) polynomial of the form

$$\lambda^k + a_{k-1}\lambda^{k-1} + \cdots + a_1\lambda + a_0$$

such that

$$A^k + a_{k-1}A^{k-1} + \cdots + a_1 A + a_0 I = 0.$$

The minimal polynomial of $A$ is the polynomial like the above, for which $p(A) = 0$,
which has smallest degree. I will discuss the uniqueness of this polynomial later. Hint:
Consider the matrices $I, A, A^2, \cdots, A^{n^2}$. There are $n^2 + 1$ of these matrices. Can they
be linearly independent? Now consider all polynomials and pick one of smallest degree
and then divide by the leading coefficient.

37. ↑Suppose the field of scalars is $\mathbb{C}$ and $A$ is an $n \times n$ matrix. From the preceding
problem, and the fundamental theorem of algebra, this minimal polynomial factors

$$(\lambda - \lambda_1)^{r_1} (\lambda - \lambda_2)^{r_2} \cdots (\lambda - \lambda_k)^{r_k}$$

where $r_j$ is the algebraic multiplicity of $\lambda_j$, and the $\lambda_j$ are distinct. Thus

$$(A - \lambda_1 I)^{r_1} (A - \lambda_2 I)^{r_2} \cdots (A - \lambda_k I)^{r_k} = 0$$

and so, letting $P = (A - \lambda_1 I)^{r_1} (A - \lambda_2 I)^{r_2} \cdots (A - \lambda_k I)^{r_k}$ and $L_j = (A - \lambda_j I)^{r_j}$,
apply the result of Problem 35 to verify that

$$\mathbb{C}^n = \ker(L_1) \oplus \cdots \oplus \ker(L_k)$$

and that $A : \ker(L_j) \to \ker(L_j)$. In this context, $\ker(L_j)$ is called the generalized
eigenspace for $\lambda_j$. You need to verify the conditions of the result of this problem hold.

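38. In the context of Problem 37, show there exists a nonzero vector $\mathbf{x}$ such that

$$(A - \lambda_j I)\mathbf{x} = \mathbf{0}.$$

This is called an eigenvector and the $\lambda_j$ is called an eigenvalue. Hint: There must exist
a vector $\mathbf{y}$ such that

$$(A - \lambda_1 I)^{r_1} (A - \lambda_2 I)^{r_2} \cdots (A - \lambda_j I)^{r_j - 1} \cdots (A - \lambda_k I)^{r_k}\mathbf{y} = \mathbf{z} \ne \mathbf{0}.$$

Why? Now what happens if you do $(A - \lambda_j I)$ to $\mathbf{z}$?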

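39. Suppose $Q(t)$ is an orthogonal matrix. This means $Q(t)$ is a real $n \times n$ matrix which
satisfies

$$Q(t)Q(t)^T = I.$$

Suppose also the entries of $Q(t)$ are differentiable. Show $(Q^T)' = -Q^T Q' Q^T$.

40. Remember the Coriolis force was $2\mathbf{\Omega} \times \mathbf{v}_B$ where $\mathbf{\Omega}$ was a particular vector which
came from the matrix $Q(t)$ as described above. Show that

$$Q(t) = \begin{pmatrix} \mathbf{i}(t) \cdot \mathbf{i}(t_0) & \mathbf{j}(t) \cdot \mathbf{i}(t_0) & \mathbf{k}(t) \cdot \mathbf{i}(t_0) \\ \mathbf{i}(t) \cdot \mathbf{j}(t_0) & \mathbf{j}(t) \cdot \mathbf{j}(t_0) & \mathbf{k}(t) \cdot \mathbf{j}(t_0) \\ \mathbf{i}(t) \cdot \mathbf{k}(t_0) & \mathbf{j}(t) \cdot \mathbf{k}(t_0) & \mathbf{k}(t) \cdot \mathbf{k}(t_0) \end{pmatrix}.$$

There will be no Coriolis force exactly when $\mathbf{\Omega} = \mathbf{0}$ which corresponds to $Q'(t) = 0$.
When will $Q'(t) = 0$?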

41. An illustration used in many beginning physics books is that of firing a rifle
horizontally and dropping an identical bullet from the same height above the perfectly
flat ground followed by an assertion that the two bullets will hit the ground at
exactly the same time. Is this true on the rotating earth assuming the experiment
takes place over a large perfectly flat field so the curvature of the earth is not an
issue? Explain. What other irregularities will occur? Recall the Coriolis acceleration
is $2\omega\left[(-y' \cos\phi)\,\mathbf{i} + (x' \cos\phi + z' \sin\phi)\,\mathbf{j} - (y' \sin\phi)\,\mathbf{k}\right]$ where $\mathbf{k}$ points away from the
center of the earth, $\mathbf{j}$ points East, and $\mathbf{i}$ points South.

3.1 Basic Techniques And Properties

Let $A$ be an $n \times n$ matrix. The determinant of $A$, denoted as $\det(A)$, is a number. If the
matrix is a $2 \times 2$ matrix, this number is very easy to find.



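Definition 3.1.1 Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then

$$\det(A) = ad - cb.$$

The determinant is also often denoted by enclosing the matrix with two vertical lines. Thus

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix}.$$

Example 3.1.2 Find $\det\begin{pmatrix} 2 & 4 \\ -1 & 6 \end{pmatrix}$.

From the definition this is just $(2)(6) - (-1)(4) = 16$.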

Assuming the determinant has been defined for $k \times k$ matrices for $k \le n - 1$, it is now
time to define it for $n \times n$ matrices.

Definition 3.1.3 Let $A = (a_{ij})$ be an $n \times n$ matrix. Then a new matrix called the cofactor
matrix, $\operatorname{cof}(A)$, is defined by $\operatorname{cof}(A) = (c_{ij})$ where to obtain $c_{ij}$ delete the $i^{th}$ row and the
$j^{th}$ column of $A$, take the determinant of the $(n-1) \times (n-1)$ matrix which results (this
is called the $ij^{th}$ minor of $A$), and then multiply this number by $(-1)^{i+j}$. To make the
formulas easier to remember, $\operatorname{cof}(A)_{ij}$ will denote the $ij^{th}$ entry of the cofactor matrix.

Now here is the definition of the determinant given recursively. 

Theorem 3.1.4 Let $A$ be an $n \times n$ matrix where $n \ge 2$. Then

$$\det(A) = \sum_{j=1}^n a_{ij} \operatorname{cof}(A)_{ij} = \sum_{i=1}^n a_{ij} \operatorname{cof}(A)_{ij}. \qquad (3.1)$$

The first formula consists of expanding the determinant along the $i^{th}$ row and the second
expands the determinant along the $j^{th}$ column.

Note that for an $n \times n$ matrix, you will need $n!$ terms to evaluate the determinant in this
way. If $n = 10$, this is $10! = 3{,}628{,}800$ terms. This is a lot of terms.
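
The recursion in Theorem 3.1.4 translates almost directly into code. The following is a minimal Python sketch, not an efficient algorithm. The names `minor` and `det_laplace` and the term counter are illustrative choices that do not come from the text, and the 2 x 2 matrix is the one implied by the computation in Example 3.1.2. The sketch expands along the first row and counts how many products the full expansion generates.

```python
from math import factorial

def minor(a, i, j):
    """Return the matrix obtained from a by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det_laplace(a, counter=None):
    """Determinant by cofactor expansion along the first row.

    counter, if given, is a one-element list that is incremented each time
    a 1 x 1 base case is reached, i.e. once per term of the full expansion.
    """
    n = len(a)
    if n == 1:
        if counter is not None:
            counter[0] += 1
        return a[0][0]
    total = 0
    for j in range(n):
        # (-1)**j is the cofactor sign (-1)^(i+j) for the first row, 0-based
        total += (-1) ** j * a[0][j] * det_laplace(minor(a, 0, j), counter)
    return total

print(det_laplace([[2, 4], [-1, 6]]))      # 16

count = [0]
det_laplace([[1] * 5 for _ in range(5)], count)
print(count[0], factorial(5))              # 120 120
```

The count grows like $n!$, which is why this method is only practical for very small matrices.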

In addition to the difficulties just discussed, why is the determinant well defined? Why 
should you get the same thing when you expand along any row or column? I think you 
should regard this claim that you always get the same answer by picking any row or column 
with considerable skepticism. It is incredible and not at all obvious. However, it requires 
a little effort to establish it. This is done in the section on the theory of the determinant 
which follows. 

Notwithstanding the difficulties involved in using the method of Laplace expansion,
certain types of matrices are very easy to deal with.

Definition 3.1.5 A matrix $M$ is upper triangular if $M_{ij} = 0$ whenever $i > j$. Thus such
a matrix equals zero below the main diagonal, the entries of the form $M_{ii}$, as shown.

$$\begin{pmatrix} * & * & \cdots & * \\ 0 & * & \ddots & \vdots \\ \vdots & \ddots & \ddots & * \\ 0 & \cdots & 0 & * \end{pmatrix}$$

A lower triangular matrix is defined similarly as a matrix for which all entries above the
main diagonal are equal to zero.

You should verify the following using the above theorem on Laplace expansion.

Corollary 3.1.6 Let $M$ be an upper (lower) triangular matrix. Then $\det(M)$ is obtained
by taking the product of the entries on the main diagonal.

Proof: The corollary is true if the matrix is $1 \times 1$. Suppose it is $n \times n$ and upper
triangular. Then the matrix is of the form

$$\begin{pmatrix} m_{11} & \mathbf{a} \\ \mathbf{0} & M_1 \end{pmatrix}$$

where $M_1$ is $(n-1) \times (n-1)$. Then expanding along the first column, you get $m_{11} \det(M_1) + 0$.
Then use the induction hypothesis to obtain that $\det(M_1) = \prod_{i=2}^n m_{ii}$. The lower
triangular case is the same, expanding along the first row instead. ■

Example 3.1.7 Let

$$A = \begin{pmatrix} 1 & 2 & 3 & 77 \\ 0 & 2 & 6 & 7 \\ 0 & 0 & 3 & 33.7 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

Find $\det(A)$.

From the above corollary, this is $(1)(2)(3)(-1) = -6$.

There are many properties satisfied by determinants. Some of the most important are
listed in the following theorem.

Theorem 3.1.8 If two rows or two columns in an $n \times n$ matrix $A$ are switched, the
determinant of the resulting matrix equals $(-1)$ times the determinant of the original matrix. If
$A$ is an $n \times n$ matrix in which two rows are equal or two columns are equal then $\det(A) = 0$.
Suppose the $i^{th}$ row of $A$ equals $(xa_1 + yb_1, \cdots, xa_n + yb_n)$. Then

$$\det(A) = x \det(A_1) + y \det(A_2)$$

where the $i^{th}$ row of $A_1$ is $(a_1, \cdots, a_n)$ and the $i^{th}$ row of $A_2$ is $(b_1, \cdots, b_n)$, all other rows
of $A_1$ and $A_2$ coinciding with those of $A$. In other words, $\det$ is a linear function of each
row of $A$. The same is true with the word "row" replaced with the word "column". In addition
to this, if $A$ and $B$ are $n \times n$ matrices, then

$$\det(AB) = \det(A) \det(B),$$

and if $A$ is an $n \times n$ matrix, then

$$\det(A) = \det(A^T).$$
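
The row properties and the transpose identity in Theorem 3.1.8 can be spot-checked numerically. The sketch below is only an illustration on hypothetical integer test data (the matrix `A`, the scalars `x`, `y`, and the rows `row_a`, `row_b` are all made up for this purpose), and the helper `det` is a plain Laplace expansion rather than anything prescribed by the text. A check on one example is of course not a proof.

```python
def det(a):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not a:
        return 1  # empty matrix: neutral value that ends the recursion
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def transpose(a):
    return [list(col) for col in zip(*a)]

# Hypothetical test data, chosen only for illustration.
A = [[1, 2, 3], [3, 0, 1], [1, 2, 1]]
x, y = 5, -2
row_a, row_b = [1, 0, 4], [2, 2, 1]

# Switching two rows multiplies the determinant by -1.
assert det([A[1], A[0], A[2]]) == -det(A)
# Two equal rows force the determinant to be zero.
assert det([A[0], A[0], A[2]]) == 0
# Linearity in the second row, all other rows held fixed.
mixed = [A[0], [x * p + y * q for p, q in zip(row_a, row_b)], A[2]]
A1 = [A[0], row_a, A[2]]
A2 = [A[0], row_b, A[2]]
assert det(mixed) == x * det(A1) + y * det(A2)
# The transpose has the same determinant.
assert det(transpose(A)) == det(A)
print("all checks passed")
```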




This theorem implies the following corollary which gives a way to find determinants. As
I pointed out above, the method of Laplace expansion will not be practical for any matrix
of large size.

Corollary 3.1.9 Let $A$ be an $n \times n$ matrix and let $B$ be the matrix obtained by replacing the
$i^{th}$ row (column) of $A$ with the sum of the $i^{th}$ row (column) added to a multiple of another
row (column). Then $\det(A) = \det(B)$. If $B$ is the matrix obtained from $A$ by replacing the
$i^{th}$ row (column) of $A$ by $a$ times the $i^{th}$ row (column) then $a \det(A) = \det(B)$.

Here is an example which shows how to use this corollary to find a determinant. 

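Example 3.1.10 Find the determinant of the matrix

$$A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 1 & 2 & 3 \\ 4 & 5 & 4 & 3 \\ 2 & 2 & -4 & 5 \end{pmatrix}.$$

Replace the second row by $(-5)$ times the first row added to it. Then replace the third
row by $(-4)$ times the first row added to it. Finally, replace the fourth row by $(-2)$ times
the first row added to it. This yields the matrix

$$B = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & -9 & -13 & -17 \\ 0 & -3 & -8 & -13 \\ 0 & -2 & -10 & -3 \end{pmatrix}$$

and from the above corollary, it has the same determinant as $A$. Now using the corollary
some more, $\det(B) = \left(\frac{-1}{3}\right) \det(C)$ where

$$C = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 11 & 22 \\ 0 & -3 & -8 & -13 \\ 0 & 6 & 30 & 9 \end{pmatrix}.$$

The second row was replaced by $(-3)$ times the third row added to the second row and then
the last row was multiplied by $(-3)$. Now replace the last row with 2 times the third added
to it and then switch the third and second rows. Then $\det(C) = -\det(D)$ where

$$D = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & -3 & -8 & -13 \\ 0 & 0 & 11 & 22 \\ 0 & 0 & 14 & -17 \end{pmatrix}.$$

You could do more row operations or you could note that this can be easily expanded along
the first column followed by expanding the $3 \times 3$ matrix which results along its first column.
Thus

$$\det(D) = 1 \cdot (-3) \begin{vmatrix} 11 & 22 \\ 14 & -17 \end{vmatrix} = 1485$$

and so $\det(C) = -1485$ and $\det(A) = \det(B) = \left(\frac{-1}{3}\right)(-1485) = 495$.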
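
The bookkeeping in this example (row additions leave the determinant alone, a row switch flips its sign) can be automated. The sketch below is one possible way to do it, assuming the matrix of Example 3.1.10 is the one whose row operations and final value 495 are described above; the function name `det_by_row_reduction` is an illustrative choice. It uses exact fractions and a slightly different sequence of row operations than the text, reducing to upper triangular form and then applying Corollary 3.1.6.

```python
from fractions import Fraction

def det_by_row_reduction(rows):
    """Determinant via row operations, tracking their effect on the determinant."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    sign = 1
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)      # no pivot available: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign            # switching two rows multiplies det by -1
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # adding a multiple of one row to another leaves det unchanged
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]           # upper triangular: product of the diagonal
    return result

A = [[1, 2, 3, 4],
     [5, 1, 2, 3],
     [4, 5, 4, 3],
     [2, 2, -4, 5]]
print(det_by_row_reduction(A))      # 495
```

Unlike Laplace expansion, the number of arithmetic operations here grows like $n^3$ rather than $n!$.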

The theorem about expanding a matrix along any row or column also provides a way to
give a formula for the inverse of a matrix. Recall the definition of the inverse of a matrix
in Definition 2.1.22 on Page 61. The following theorem gives a formula for the inverse of a
matrix. It is proved in the next section.

Theorem 3.1.11 $A^{-1}$ exists if and only if $\det(A) \ne 0$. If $\det(A) \ne 0$, then $A^{-1} = \left(a^{-1}_{ij}\right)$
where

$$a^{-1}_{ij} = \det(A)^{-1} \operatorname{cof}(A)_{ji}$$

for $\operatorname{cof}(A)_{ij}$ the $ij^{th}$ cofactor of $A$.

Theorem 3.1.11 says that to find the inverse, take the transpose of the cofactor matrix
and divide by the determinant. The transpose of the cofactor matrix is called the adjugate
or sometimes the classical adjoint of the matrix $A$. It is an abomination to call it the adjoint
although you do sometimes see it referred to in this way. In words, $A^{-1}$ is equal to one over
the determinant of $A$ times the adjugate matrix of $A$.
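
As a hedged illustration of Theorem 3.1.11, here is a short Python sketch that builds the cofactor matrix, transposes it, and divides by the determinant. The helper names and the integer test matrix are assumptions made for this example (the matrix is not claimed to be the one the text uses), and the code is meant for small matrices with integer entries only.

```python
from fractions import Fraction

def det(a):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not a:
        return 1  # empty matrix: neutral value that ends the recursion
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cofactor_matrix(a):
    """Matrix whose (i, j) entry is (-1)^(i+j) times the ij-th minor of a."""
    n = len(a)
    cof = []
    for i in range(n):
        row = []
        for j in range(n):
            minor = [r[:j] + r[j + 1:] for k, r in enumerate(a) if k != i]
            row.append((-1) ** (i + j) * det(minor))
        cof.append(row)
    return cof

def inverse_by_adjugate(a):
    """Inverse of an integer matrix as (1 / det) times the transposed cofactor matrix."""
    d = det(a)
    if d == 0:
        raise ValueError("det(A) = 0, so A has no inverse")
    cof = cofactor_matrix(a)
    n = len(a)
    # entry (i, j) of the inverse uses cof[j][i]: note the transpose
    return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]

A = [[1, 2, 3], [3, 0, 1], [1, 2, 1]]   # hypothetical example with det(A) = 12
A_inv = inverse_by_adjugate(A)
for row in A_inv:
    print([str(v) for v in row])
# ['-1/6', '1/3', '1/6']
# ['-1/6', '-1/6', '2/3']
# ['1/2', '0', '-1/2']
```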

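Example 3.1.12 Find the inverse of the matrix

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 0 & 1 \\ 1 & 2 & 1 \end{pmatrix}.$$

First find the determinant of this matrix. This is seen to be 12. The cofactor matrix of
$A$ is

$$\begin{pmatrix} -2 & -2 & 6 \\ 4 & -2 & 0 \\ 2 & 8 & -6 \end{pmatrix}.$$

Each entry of $A$ was replaced by its cofactor. Therefore, from the above theorem, the
inverse of $A$ should equal

$$\frac{1}{12} \begin{pmatrix} -2 & -2 & 6 \\ 4 & -2 & 0 \\ 2 & 8 & -6 \end{pmatrix}^T = \begin{pmatrix} -\frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ -\frac{1}{6} & -\frac{1}{6} & \frac{2}{3} \\ \frac{1}{2} & 0 & -\frac{1}{2} \end{pmatrix}.$$

This way of finding inverses is especially useful in the case where it is desired to find the
inverse of a matrix whose entries are functions.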

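Example 3.1.13 Suppose

$$A(t) = \begin{pmatrix} e^t & 0 & 0 \\ 0 & \cos t & \sin t \\ 0 & -\sin t & \cos t \end{pmatrix}.$$

Find $A(t)^{-1}$.

First note $\det(A(t)) = e^t$. A routine computation using the above theorem shows that
this inverse is

$$\frac{1}{e^t} \begin{pmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{pmatrix}^T = \begin{pmatrix} e^{-t} & 0 & 0 \\ 0 & \cos t & -\sin t \\ 0 & \sin t & \cos t \end{pmatrix}.$$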



This formula for the inverse also implies a famous procedure known as Cramer's rule.
Cramer's rule gives a formula for the solutions, $\mathbf{x}$, to a system of equations, $A\mathbf{x} = \mathbf{y}$.

In case you are solving a system of equations, $A\mathbf{x} = \mathbf{y}$ for $\mathbf{x}$, it follows that if $A^{-1}$ exists,

$$\mathbf{x} = (A^{-1}A)\mathbf{x} = A^{-1}(A\mathbf{x}) = A^{-1}\mathbf{y}$$

thus solving the system. Now in the case that $A^{-1}$ exists, there is a formula for $A^{-1}$ given
above. Using this formula,

$$x_i = \sum_{j=1}^n a^{-1}_{ij} y_j = \sum_{j=1}^n \frac{1}{\det(A)} \operatorname{cof}(A)_{ji}\, y_j.$$

By the formula for the expansion of a determinant along a column,

$$x_i = \frac{1}{\det(A)} \det\begin{pmatrix} * & \cdots & y_1 & \cdots & * \\ \vdots & & \vdots & & \vdots \\ * & \cdots & y_n & \cdots & * \end{pmatrix},$$

where here the $i^{th}$ column of $A$ is replaced with the column vector $(y_1, \cdots, y_n)^T$, and the
determinant of this modified matrix is taken and divided by $\det(A)$. This formula is known
as Cramer's rule.

Procedure 3.1.14 Suppose $A$ is an $n \times n$ matrix and it is desired to solve the system
$A\mathbf{x} = \mathbf{y}$, $\mathbf{y} = (y_1, \cdots, y_n)^T$ for $\mathbf{x} = (x_1, \cdots, x_n)^T$. Then Cramer's rule says

$$x_i = \frac{\det A_i}{\det A}$$

where $A_i$ is obtained from $A$ by replacing the $i^{th}$ column of $A$ with the column $(y_1, \cdots, y_n)^T$.
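
Procedure 3.1.14 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system below is a hypothetical example with integer data, `cramer` and `det` are names chosen here, and the determinant helper is a plain Laplace expansion, so the code is suitable only for small systems.

```python
from fractions import Fraction

def det(a):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not a:
        return 1  # empty matrix: neutral value that ends the recursion
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cramer(a, y):
    """Solve a x = y for an integer matrix a with det(a) != 0 using Cramer's rule."""
    d = det(a)
    if d == 0:
        raise ValueError("det(A) = 0, Cramer's rule does not apply")
    solution = []
    for i in range(len(a)):
        # A_i: replace the i-th column of A with the column y
        a_i = [row[:i] + [y[k]] + row[i + 1:] for k, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution

A = [[1, 2, 3], [3, 0, 1], [1, 2, 1]]   # hypothetical system, det(A) = 12
y = [1, 2, 3]
x = cramer(A, y)
print([str(v) for v in x])              # ['1', '3/2', '-1']
```

Substituting back, $1 + 2(3/2) + 3(-1) = 1$, $3(1) + 0 + (-1) = 2$, and $1 + 2(3/2) + (-1) = 3$, so the computed vector does solve the system.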

The following theorem is of fundamental importance and ties together many of the ideas
presented above. It is proved in the next section.

Theorem 3.1.15 Let $A$ be an $n \times n$ matrix. Then the following are equivalent.

1. $A$ is one to one.

2. $A$ is onto.

3. $\det(A) \ne 0$.



3.2 Exercises 

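1. Find the determinants of the following matrices.

(a) $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 2 \\ 0 & 9 & 8 \end{pmatrix}$ (The answer is 31.)

(b) $\begin{pmatrix} 4 & 3 & 2 \\ 1 & 7 & 8 \\ 3 & -9 & 3 \end{pmatrix}$ (The answer is 375.)

(c) $\begin{pmatrix} 1 & 2 & 3 & 2 \\ 1 & 3 & 2 & 3 \\ 4 & 1 & 5 & 0 \\ 1 & 2 & 1 & 2 \end{pmatrix}$ (The answer is $-2$.)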

2. If $A^{-1}$ exists, what is the relationship between $\det(A)$ and $\det(A^{-1})$? Explain your
answer.

3. Let $A$ be an $n \times n$ matrix where $n$ is odd. Suppose also that $A$ is skew symmetric.
This means $A^T = -A$. Show that $\det(A) = 0$.

4. Is it true that $\det(A + B) = \det(A) + \det(B)$? If this is so, explain why it is so and
if it is not so, give a counter example.

5. Let $A$ be an $r \times r$ matrix and suppose there are $r - 1$ rows (columns) such that all rows
(columns) are linear combinations of these $r - 1$ rows (columns). Show $\det(A) = 0$.

6. Show $\det(aA) = a^n \det(A)$ where here $A$ is an $n \times n$ matrix and $a$ is a scalar.

7. Suppose $A$ is an upper triangular matrix. Show that $A^{-1}$ exists if and only if all
elements of the main diagonal are non zero. Is it true that $A^{-1}$ will also be upper
triangular? Explain. Is everything the same for lower triangular matrices?

8. Let $A$ and $B$ be two $n \times n$ matrices. $A \sim B$ ($A$ is similar to $B$) means there exists an
invertible matrix $S$ such that $A = S^{-1}BS$. Show that if $A \sim B$, then $B \sim A$. Show
also that $A \sim A$ and that if $A \sim B$ and $B \sim C$, then $A \sim C$.

9. In the context of Problem 8 show that if $A \sim B$, then $\det(A) = \det(B)$.

10. Let $A$ be an $n \times n$ matrix and let $\mathbf{x}$ be a nonzero vector such that $A\mathbf{x} = \lambda\mathbf{x}$ for some
scalar, $\lambda$. When this occurs, the vector, $\mathbf{x}$ is called an eigenvector and the scalar, $\lambda$
is called an eigenvalue. It turns out that not every number is an eigenvalue. Only
certain ones are. Why? Hint: Show that if $A\mathbf{x} = \lambda\mathbf{x}$, then $(\lambda I - A)\mathbf{x} = \mathbf{0}$. Explain
why this shows that $(\lambda I - A)$ is not one to one and not onto. Now use Theorem 3.1.15
to argue $\det(\lambda I - A) = 0$. What sort of equation is this? How many solutions does it
have?

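11. Suppose $\det(\lambda I - A) = 0$. Show using Theorem 3.1.15 there exists $\mathbf{x} \ne \mathbf{0}$ such that
$(\lambda I - A)\mathbf{x} = \mathbf{0}$.

12. Let $F(t) = \det\begin{pmatrix} a(t) & b(t) \\ c(t) & d(t) \end{pmatrix}$. Verify

$$F'(t) = \det\begin{pmatrix} a'(t) & b'(t) \\ c(t) & d(t) \end{pmatrix} + \det\begin{pmatrix} a(t) & b(t) \\ c'(t) & d'(t) \end{pmatrix}.$$

Now suppose

$$F(t) = \det\begin{pmatrix} a(t) & b(t) & c(t) \\ d(t) & e(t) & f(t) \\ g(t) & h(t) & i(t) \end{pmatrix}.$$

Use Laplace expansion and the first part to verify $F'(t) =$

$$\det\begin{pmatrix} a'(t) & b'(t) & c'(t) \\ d(t) & e(t) & f(t) \\ g(t) & h(t) & i(t) \end{pmatrix} + \det\begin{pmatrix} a(t) & b(t) & c(t) \\ d'(t) & e'(t) & f'(t) \\ g(t) & h(t) & i(t) \end{pmatrix} + \det\begin{pmatrix} a(t) & b(t) & c(t) \\ d(t) & e(t) & f(t) \\ g'(t) & h'(t) & i'(t) \end{pmatrix}.$$

Conjecture a general result valid for $n \times n$ matrices and explain why it will be true.
Can a similar thing be done with the columns?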

13. Use the formula for the inverse in terms of the cofactor matrix to find the inverse of
the matrix



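14. Let $A$ be an $r \times r$ matrix and let $B$ be an $m \times m$ matrix such that $r + m = n$. Consider
the following $n \times n$ block matrix

$$C = \begin{pmatrix} A & 0 \\ D & B \end{pmatrix}$$

where the $D$ is an $m \times r$ matrix, and the $0$ is a $r \times m$ matrix. Letting $I_k$ denote the
$k \times k$ identity matrix, tell why

$$C = \begin{pmatrix} A & 0 \\ D & I_m \end{pmatrix} \begin{pmatrix} I_r & 0 \\ 0 & B \end{pmatrix}.$$

Now explain why $\det(C) = \det(A) \det(B)$. Hint: Part of this will require an
explanation of why

$$\det\begin{pmatrix} A & 0 \\ D & I_m \end{pmatrix} = \det(A).$$

See Corollary 3.1.9.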

15. Suppose $Q$ is an orthogonal matrix. This means $Q$ is a real $n \times n$ matrix which satisfies

$$QQ^T = I.$$

Find the possible values for $\det(Q)$.

16. Suppose $Q(t)$ is an orthogonal matrix. This means $Q(t)$ is a real $n \times n$ matrix which
satisfies

$$Q(t)Q(t)^T = I.$$

Suppose $Q(t)$ is continuous for $t \in [a, b]$, some interval. Also suppose $\det(Q(t_0)) = 1$
for some $t_0 \in [a, b]$. Show that it follows $\det(Q(t)) = 1$ for all $t \in [a, b]$.

3.3 The Mathematical Theory Of Determinants

It is easiest to give a different definition of the determinant which is clearly well defined
and then prove the earlier one in terms of Laplace expansion. Let $(i_1, \cdots, i_n)$ be an ordered
list of numbers from $\{1, \cdots, n\}$. This means the order is important so $(1, 2, 3)$ and $(2, 1, 3)$
are different. There will be some repetition between this section and the earlier section on
determinants. The main purpose is to give all the missing proofs. Two books which give
a good introduction to determinants are Apostol [1] and Rudin [22]. A recent book which
also has a good introduction is Baker [3].

3.3.1 The Function sgn

The following Lemma will be essential in the definition of the determinant.

Lemma 3.3.1 There exists a unique function, $\operatorname{sgn}_n$ which maps each ordered list of
numbers from $\{1, \cdots, n\}$ to one of the three numbers, $0$, $1$, or $-1$ which also has the following
properties.

$$\operatorname{sgn}_n(1, \cdots, n) = 1 \qquad (3.2)$$

$$\operatorname{sgn}_n(i_1, \cdots, p, \cdots, q, \cdots, i_n) = -\operatorname{sgn}_n(i_1, \cdots, q, \cdots, p, \cdots, i_n) \qquad (3.3)$$

In words, the second property states that if two of the numbers are switched, the value of the
function is multiplied by $-1$. Also, in the case where $n > 1$ and $\{i_1, \cdots, i_n\} = \{1, \cdots, n\}$ so
that every number from $\{1, \cdots, n\}$ appears in the ordered list, $(i_1, \cdots, i_n)$,

$$\operatorname{sgn}_n(i_1, \cdots, i_{\theta-1}, n, i_{\theta+1}, \cdots, i_n) = (-1)^{n-\theta} \operatorname{sgn}_{n-1}(i_1, \cdots, i_{\theta-1}, i_{\theta+1}, \cdots, i_n) \qquad (3.4)$$

where $n = i_\theta$ in the ordered list, $(i_1, \cdots, i_n)$.

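Proof: To begin with, it is necessary to show the existence of such a function. This is
clearly true if $n = 1$. Define $\operatorname{sgn}_1(1) = 1$ and observe that it works. No switching is possible.
In the case where $n = 2$, it is also clearly true. Let $\operatorname{sgn}_2(1, 2) = 1$ and $\operatorname{sgn}_2(2, 1) = -1$
while $\operatorname{sgn}_2(2, 2) = \operatorname{sgn}_2(1, 1) = 0$ and verify it works. Assuming such a function exists for $n$,
$\operatorname{sgn}_{n+1}$ will be defined in terms of $\operatorname{sgn}_n$. If there are any repeated numbers in $(i_1, \cdots, i_{n+1})$,
$\operatorname{sgn}_{n+1}(i_1, \cdots, i_{n+1}) = 0$. If there are no repeats, then $n + 1$ appears somewhere in the
ordered list. Let $\theta$ be the position of the number $n + 1$ in the list. Thus, the list is of the
form $(i_1, \cdots, i_{\theta-1}, n+1, i_{\theta+1}, \cdots, i_{n+1})$. From 3.4 it must be that

$$\operatorname{sgn}_{n+1}(i_1, \cdots, i_{\theta-1}, n+1, i_{\theta+1}, \cdots, i_{n+1}) = (-1)^{n+1-\theta} \operatorname{sgn}_n(i_1, \cdots, i_{\theta-1}, i_{\theta+1}, \cdots, i_{n+1}).$$

It is necessary to verify this satisfies 3.2 and 3.3 with $n$ replaced with $n + 1$. The first of
these is obviously true because

$$\operatorname{sgn}_{n+1}(1, \cdots, n, n+1) = (-1)^{n+1-(n+1)} \operatorname{sgn}_n(1, \cdots, n) = 1.$$

If there are repeated numbers in $(i_1, \cdots, i_{n+1})$, then it is obvious 3.3 holds because both
sides would equal zero from the above definition. It remains to verify 3.3 in the case where
there are no numbers repeated in $(i_1, \cdots, i_{n+1})$. Consider

$$\operatorname{sgn}_{n+1}\left(i_1, \cdots, \overset{r}{p}, \cdots, \overset{s}{q}, \cdots, i_{n+1}\right),$$

where the $r$ above the $p$ indicates the number $p$ is in the $r^{th}$ position and the $s$ above the $q$
indicates that the number, $q$ is in the $s^{th}$ position. Suppose first that $r < \theta < s$. Then

$$\operatorname{sgn}_{n+1}\left(i_1, \cdots, \overset{r}{p}, \cdots, \overset{\theta}{n+1}, \cdots, \overset{s}{q}, \cdots, i_{n+1}\right) = (-1)^{n+1-\theta} \operatorname{sgn}_n\left(i_1, \cdots, \overset{r}{p}, \cdots, \overset{s-1}{q}, \cdots, i_{n+1}\right)$$

while

$$\operatorname{sgn}_{n+1}\left(i_1, \cdots, \overset{r}{q}, \cdots, \overset{\theta}{n+1}, \cdots, \overset{s}{p}, \cdots, i_{n+1}\right) = (-1)^{n+1-\theta} \operatorname{sgn}_n\left(i_1, \cdots, \overset{r}{q}, \cdots, \overset{s-1}{p}, \cdots, i_{n+1}\right)$$

and so, by induction, a switch of $p$ and $q$ introduces a minus sign in the result. Similarly, if
$\theta > s$ or if $\theta < r$ it also follows that 3.3 holds. The interesting case is when $\theta = r$ or $\theta = s$.
Consider the case where $\theta = r$ and note the other case is entirely similar.

$$\operatorname{sgn}_{n+1}\left(i_1, \cdots, \overset{r}{n+1}, \cdots, \overset{s}{q}, \cdots, i_{n+1}\right) = (-1)^{n+1-r} \operatorname{sgn}_n\left(i_1, \cdots, \overset{s-1}{q}, \cdots, i_{n+1}\right) \qquad (3.5)$$

while

$$\operatorname{sgn}_{n+1}\left(i_1, \cdots, \overset{r}{q}, \cdots, \overset{s}{n+1}, \cdots, i_{n+1}\right) = (-1)^{n+1-s} \operatorname{sgn}_n\left(i_1, \cdots, \overset{r}{q}, \cdots, i_{n+1}\right). \qquad (3.6)$$

By making $s - 1 - r$ switches, move the $q$ which is in the $(s-1)^{th}$ position in 3.5 to the $r^{th}$
position in 3.6. By induction, each of these switches introduces a factor of $-1$ and so

$$\operatorname{sgn}_n\left(i_1, \cdots, \overset{s-1}{q}, \cdots, i_{n+1}\right) = (-1)^{s-1-r} \operatorname{sgn}_n\left(i_1, \cdots, \overset{r}{q}, \cdots, i_{n+1}\right).$$

Therefore,

$$\operatorname{sgn}_{n+1}\left(i_1, \cdots, \overset{r}{n+1}, \cdots, \overset{s}{q}, \cdots, i_{n+1}\right) = (-1)^{n+1-r} \operatorname{sgn}_n\left(i_1, \cdots, \overset{s-1}{q}, \cdots, i_{n+1}\right)$$

$$= (-1)^{n+1-r} (-1)^{s-1-r} \operatorname{sgn}_n\left(i_1, \cdots, \overset{r}{q}, \cdots, i_{n+1}\right)$$

$$= (-1)^{n+s} \operatorname{sgn}_n\left(i_1, \cdots, \overset{r}{q}, \cdots, i_{n+1}\right) = (-1)^{2s-1} (-1)^{n+1-s} \operatorname{sgn}_n\left(i_1, \cdots, \overset{r}{q}, \cdots, i_{n+1}\right)$$

$$= -\operatorname{sgn}_{n+1}\left(i_1, \cdots, \overset{r}{q}, \cdots, \overset{s}{n+1}, \cdots, i_{n+1}\right).$$

This proves the existence of the desired function. Uniqueness follows easily from the
following lemma.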

Lemma 3.3.2 Every ordered list of $\{1, 2, \cdots, n\}$ can be obtained from every other ordered
list by a finite number of switches. Also, $\operatorname{sgn}_n$ is unique.

Proof: This is obvious if $n = 1$ or $2$. Suppose then that it is true for sets of $n - 1$
elements. Take two ordered lists of numbers, $P_1, P_2$. To get from $P_1$ to $P_2$ using switches,
first make a switch to obtain the last element in the list coinciding with the last element of
$P_2$. By induction, there are switches which will arrange the first $n - 1$ to the right order.

To see $\operatorname{sgn}_n$ is unique, if there exist two functions, $f$ and $g$ both satisfying 3.2 and 3.3,
you could start with $f(1, \cdots, n) = g(1, \cdots, n)$ and applying the same sequence of switches,
eventually arrive at $f(i_1, \cdots, i_n) = g(i_1, \cdots, i_n)$. If any numbers are repeated, then 3.3
gives both functions are equal to zero for that ordered list. ■

Definition 3.3.3 When you have an ordered list of distinct numbers from $\{1, 2, \cdots, n\}$, say
$(i_1, \cdots, i_n)$, this ordered list is called a permutation. The symbol for all such permutations
is $S_n$. The number $\operatorname{sgn}_n(i_1, \cdots, i_n)$ is called the sign of the permutation.

A permutation can also be considered as a function from the set

$$\{1, 2, \cdots, n\} \text{ to } \{1, 2, \cdots, n\}$$

as follows. Let $f(k) = i_k$. Permutations are of fundamental importance in certain areas
of math. For example, it was by considering permutations that Galois was able to give a
criterion for solution of polynomial equations by radicals, but this is a different direction
than what is being attempted here.
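
The recursive description of $\operatorname{sgn}_n$ in 3.4 can be turned into a short function. The following Python sketch is one way to do it, under the assumption that the input is an ordered list (or tuple) of numbers from $\{1, \cdots, n\}$ with $n$ equal to the length of the list; the name `sgn` is simply chosen to match the text. The loop at the end checks property 3.3 on every permutation of $\{1, 2, 3, 4\}$.

```python
from itertools import permutations

def sgn(lst):
    """sgn_n of an ordered list of numbers from {1, ..., n}, where n = len(lst)."""
    n = len(lst)
    if len(set(lst)) < n:
        return 0                          # a repeated number gives 0
    if n == 1:
        return 1                          # sgn_1(1) = 1
    theta = lst.index(n) + 1              # 1-based position of n in the list
    rest = lst[:theta - 1] + lst[theta:]  # the list with n removed
    return (-1) ** (n - theta) * sgn(rest)    # formula 3.4

print(sgn((1, 2, 3)), sgn((2, 1, 3)), sgn((1, 1, 3)))   # 1 -1 0

# Property 3.3: switching any two entries changes the sign.
for p in permutations(range(1, 5)):
    for r in range(4):
        for s in range(r + 1, 4):
            q = list(p)
            q[r], q[s] = q[s], q[r]
            assert sgn(tuple(q)) == -sgn(p)
```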

In what follows $\operatorname{sgn}$ will often be used rather than $\operatorname{sgn}_n$ because the context supplies the
appropriate $n$.

3.3.2 The Definition Of The Determinant

Definition 3.3.4 Let $f$ be a real valued function which has the set of ordered lists of numbers
from $\{1, \cdots, n\}$ as its domain. Define

$$\sum_{(k_1, \cdots, k_n)} f(k_1 \cdots k_n)$$

to be the sum of all the $f(k_1 \cdots k_n)$ for all possible choices of ordered lists $(k_1, \cdots, k_n)$ of
numbers of $\{1, \cdots, n\}$. For example,

$$\sum_{(k_1, k_2)} f(k_1, k_2) = f(1, 2) + f(2, 1) + f(1, 1) + f(2, 2).$$

Definition 3.3.5 Let $(a_{ij}) = A$ denote an $n \times n$ matrix. The determinant of $A$, denoted
by $\det(A)$, is defined by

$$\det(A) = \sum_{(k_1, \cdots, k_n)} \operatorname{sgn}(k_1, \cdots, k_n)\, a_{1k_1} \cdots a_{nk_n}$$

where the sum is taken over all ordered lists of numbers from $\{1, \cdots, n\}$. Note it suffices to
take the sum over only those ordered lists in which there are no repeats because if there are,
$\operatorname{sgn}(k_1, \cdots, k_n) = 0$ and so that term contributes $0$ to the sum.
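
Definition 3.3.5 can be evaluated directly for small matrices. The Python sketch below is an illustration only: it sums over permutations (ordered lists with repeats contribute 0, as noted in the definition), it computes the sign through the recursion 3.4, and the names `sgn` and `det_by_definition` as well as the test matrices are choices made here rather than anything fixed by the text.

```python
from itertools import permutations

def sgn(lst):
    """Sign of an ordered list of numbers from {1, ..., n}, via formula 3.4."""
    n = len(lst)
    if len(set(lst)) < n:
        return 0
    if n == 1:
        return 1
    theta = lst.index(n) + 1
    return (-1) ** (n - theta) * sgn(lst[:theta - 1] + lst[theta:])

def det_by_definition(a):
    """Sum of sgn(k_1, ..., k_n) * a[1][k_1] * ... * a[n][k_n] over permutations."""
    n = len(a)
    total = 0
    for ks in permutations(range(1, n + 1)):
        term = sgn(ks)
        for row, k in enumerate(ks):
            term *= a[row][k - 1]     # entry in row (row + 1), column k
        total += term
    return total

print(det_by_definition([[2, 4], [-1, 6]]))                    # 16
print(det_by_definition([[1, 2, 3], [3, 2, 2], [0, 9, 8]]))    # 31
```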

Let $A$ be an $n \times n$ matrix $A = (a_{ij})$ and let $(r_1, \cdots, r_n)$ denote an ordered list of $n$
numbers from $\{1, \cdots, n\}$. Let $A(r_1, \cdots, r_n)$ denote the matrix whose $k^{th}$ row is the $r_k$ row
of the matrix $A$. Thus

$$\det(A(r_1, \cdots, r_n)) = \sum_{(k_1, \cdots, k_n)} \operatorname{sgn}(k_1, \cdots, k_n)\, a_{r_1k_1} \cdots a_{r_nk_n} \qquad (3.7)$$

and $A(1, \cdots, n) = A$.

Proposition 3.3.6 Let $(r_1, \cdots, r_n)$ be an ordered list of numbers from $\{1, \cdots, n\}$. Then

$$\operatorname{sgn}(r_1, \cdots, r_n) \det(A) = \sum_{(k_1, \cdots, k_n)} \operatorname{sgn}(k_1, \cdots, k_n)\, a_{r_1k_1} \cdots a_{r_nk_n} \qquad (3.8)$$

$$= \det(A(r_1, \cdots, r_n)). \qquad (3.9)$$

Proof: Let $(1, \cdots, n) = (1, \cdots, r, \cdots, s, \cdots, n)$ so $r < s$.

$$\det(A(1, \cdots, r, \cdots, s, \cdots, n)) = \sum_{(k_1, \cdots, k_n)} \operatorname{sgn}(k_1, \cdots, k_r, \cdots, k_s, \cdots, k_n)\, a_{1k_1} \cdots a_{rk_r} \cdots a_{sk_s} \cdots a_{nk_n}, \qquad (3.10)$$

and renaming the variables, calling $k_s$, $k_r$ and $k_r$, $k_s$, this equals

$$= \sum_{(k_1, \cdots, k_n)} \operatorname{sgn}(k_1, \cdots, k_s, \cdots, k_r, \cdots, k_n)\, a_{1k_1} \cdots a_{rk_s} \cdots a_{sk_r} \cdots a_{nk_n}$$

$$= \sum_{(k_1, \cdots, k_n)} -\operatorname{sgn}(k_1, \cdots, k_r, \cdots, k_s, \cdots, k_n)\, a_{1k_1} \cdots a_{sk_r} \cdots a_{rk_s} \cdots a_{nk_n}$$

$$= -\det(A(1, \cdots, s, \cdots, r, \cdots, n)). \qquad (3.11)$$

Consequently,

$$\det(A(1, \cdots, s, \cdots, r, \cdots, n)) = -\det(A(1, \cdots, r, \cdots, s, \cdots, n)) = -\det(A).$$

Now letting $A(1, \cdots, s, \cdots, r, \cdots, n)$ play the role of $A$, and continuing in this way,
switching pairs of numbers,

$$\det(A(r_1, \cdots, r_n)) = (-1)^p \det(A)$$

where it took $p$ switches to obtain $(r_1, \cdots, r_n)$ from $(1, \cdots, n)$. By Lemma 3.3.1, this implies

$$\det(A(r_1, \cdots, r_n)) = (-1)^p \det(A) = \operatorname{sgn}(r_1, \cdots, r_n) \det(A)$$

and proves the proposition in the case when there are no repeated numbers in the ordered
list, $(r_1, \cdots, r_n)$. However, if there is a repeat, say the $r^{th}$ row equals the $s^{th}$ row, then the
reasoning of 3.10 - 3.11 shows that $\det(A(r_1, \cdots, r_n)) = 0$ and also $\operatorname{sgn}(r_1, \cdots, r_n) = 0$ so
the formula holds in this case also. ■

Observation 3.3.7 There are $n!$ ordered lists of distinct numbers from $\{1, \cdots, n\}$.

To see this, consider $n$ slots placed in order. There are $n$ choices for the first slot. For
each of these choices, there are $n - 1$ choices for the second. Thus there are $n(n-1)$ ways
to fill the first two slots. Then for each of these ways there are $n - 2$ choices left for the third
slot. Continuing this way, there are $n!$ ordered lists of distinct numbers from $\{1, \cdots, n\}$ as
stated in the observation.

3.3.3 A Symmetric Definition 

With the above, it is possible to give a more symmetric description of the determinant from
which it will follow that $\det(A) = \det(A^T)$.

Corollary 3.3.8 The following formula for $\det(A)$ is valid.

$$\det(A) = \frac{1}{n!} \sum_{(r_1, \cdots, r_n)} \sum_{(k_1, \cdots, k_n)} \operatorname{sgn}(r_1, \cdots, r_n) \operatorname{sgn}(k_1, \cdots, k_n)\, a_{r_1k_1} \cdots a_{r_nk_n}. \qquad (3.12)$$

And also $\det(A^T) = \det(A)$ where $A^T$ is the transpose of $A$. (Recall that for $A^T = (a^T_{ij})$, $a^T_{ij} = a_{ji}$.)

Proof: From Proposition 3.3.6, if the $r_i$ are distinct,

$$\det(A) = \sum_{(k_1, \cdots, k_n)} \operatorname{sgn}(r_1, \cdots, r_n) \operatorname{sgn}(k_1, \cdots, k_n)\, a_{r_1k_1} \cdots a_{r_nk_n}.$$

Summing over all ordered lists, $(r_1, \cdots, r_n)$ where the $r_i$ are distinct, (If the $r_i$ are not
distinct, $\operatorname{sgn}(r_1, \cdots, r_n) = 0$ and so there is no contribution to the sum.)

$$n! \det(A) = \sum_{(r_1, \cdots, r_n)} \sum_{(k_1, \cdots, k_n)} \operatorname{sgn}(r_1, \cdots, r_n) \operatorname{sgn}(k_1, \cdots, k_n)\, a_{r_1k_1} \cdots a_{r_nk_n}.$$

This proves the corollary since the formula gives the same number for $A$ as it does for $A^T$. ■

Corollary 3.3.9 If two rows or two columns in an $n \times n$ matrix $A$ are switched, the
determinant of the resulting matrix equals $(-1)$ times the determinant of the original matrix.
If $A$ is an $n \times n$ matrix in which two rows are equal or two columns are equal then $\det(A) = 0$.
Suppose the $i^{th}$ row of $A$ equals $(xa_1 + yb_1, \cdots, xa_n + yb_n)$. Then

$$\det(A) = x\det(A_1) + y\det(A_2)$$

where the $i^{th}$ row of $A_1$ is $(a_1,\cdots,a_n)$ and the $i^{th}$ row of $A_2$ is $(b_1,\cdots,b_n)$, all other rows
of $A_1$ and $A_2$ coinciding with those of $A$. In other words, det is a linear function of each
row of $A$. The same is true with the word "row" replaced with the word "column".

Proof: By Proposition 3.3.6 when two rows are switched, the determinant of the resulting
matrix is $(-1)$ times the determinant of the original matrix. By Corollary 3.3.8 the
same holds for columns because the columns of the matrix equal the rows of the transposed
matrix. Thus if $A_1$ is the matrix obtained from $A$ by switching two columns,

$$\det(A) = \det(A^T) = -\det(A_1^T) = -\det(A_1).$$

If $A$ has two equal columns or two equal rows, then switching them results in the same
matrix. Therefore, $\det(A) = -\det(A)$ and so $\det(A) = 0$.
It remains to verify the last assertion. Writing $a_{jk}$ for the entries of the rows of $A$ other than the $i^{th}$,

$$\det(A) = \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\,a_{1k_1}\cdots\left(xa_{k_i} + yb_{k_i}\right)\cdots a_{nk_n}$$

$$= x\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\,a_{1k_1}\cdots a_{k_i}\cdots a_{nk_n}
+ y\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\,a_{1k_1}\cdots b_{k_i}\cdots a_{nk_n} = x\det(A_1) + y\det(A_2).$$

The same is true of columns because $\det(A^T) = \det(A)$ and the rows of $A^T$ are the columns
of $A$. ■
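The permutation sum used as the definition can be tried out directly on a small matrix. Here is a minimal Python sketch, assuming integer entries so that all arithmetic is exact; the names `sgn`, `det` and `transpose` are made up for this illustration. It evaluates the sum over all ordered lists $(k_1,\cdots,k_n)$ and then checks $\det(A) = \det(A^T)$, the sign change when two rows are switched, and linearity in one row.

```python
from itertools import permutations

def sgn(perm):
    """Sign of an ordered list of distinct integers, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det(A):
    """Sum over all ordered lists (k_1, ..., k_n) of sgn(k) * a_{1 k_1} ... a_{n k_n}."""
    n = len(A)
    total = 0
    for ks in permutations(range(n)):
        term = sgn(ks)
        for row, col in enumerate(ks):
            term *= A[row][col]
        total += term
    return total

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[2, 0, 1], [1, 3, -1], [4, 1, 5]]
swapped = [A[1], A[0], A[2]]            # rows 1 and 2 switched

# Linearity in the first row: (2, 3, 1) = 2 * (1, 0, 2) + 3 * (0, 1, -1).
rest = [[1, 3, -1], [4, 1, 5]]
combined = [[2, 3, 1]] + rest
A1 = [[1, 0, 2]] + rest
A2 = [[0, 1, -1]] + rest

print(det(A), det(transpose(A)), det(swapped))
print(det(combined), 2 * det(A1) + 3 * det(A2))
```

The sum has $n!$ terms, so this is only sensible for very small $n$. It is meant as a check on the formulas, not as a practical way to compute determinants.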



3.3.4 Basic Properties Of The Determinant 

Definition 3.3.10 A vector $w$ is a linear combination of the vectors $\{v_1,\cdots,v_r\}$ if there
exist scalars $c_1,\cdots,c_r$ such that $w = \sum_{k=1}^{r} c_k v_k$. This is the same as saying
$w \in \operatorname{span}(v_1,\cdots,v_r)$.



The following corollary is also of great use. 

Corollary 3.3.11 Suppose A is an n x n matrix and some column (row) is a linear com- 
bination of r other columns (rows). Then det (A) = 0. 

Proof: Let $A = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}$ be the columns of $A$ and suppose the condition that
one column is a linear combination of $r$ of the others is satisfied. Then by using Corollary
3.3.9 you may rearrange the columns to have the $n^{th}$ column a linear combination of the
first $r$ columns. Thus $a_n = \sum_{k=1}^{r} c_k a_k$ and so

$$\det(A) = \det\begin{pmatrix} a_1 & \cdots & a_r & \cdots & a_{n-1} & \sum_{k=1}^{r} c_k a_k \end{pmatrix}.$$




By Corollary 3.3.9

$$\det(A) = \sum_{k=1}^{r} c_k \det\begin{pmatrix} a_1 & \cdots & a_r & \cdots & a_{n-1} & a_k \end{pmatrix} = 0.$$

The case for rows follows from the fact that $\det(A) = \det(A^T)$. ■

Recall the following definition of matrix multiplication.

Definition 3.3.12 If $A$ and $B$ are $n \times n$ matrices, $A = (a_{ij})$ and $B = (b_{ij})$, $AB = (c_{ij})$
where $c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$.

One of the most important rules about determinants is that the determinant of a product
equals the product of the determinants.

Theorem 3.3.13 Let $A$ and $B$ be $n \times n$ matrices. Then

$$\det(AB) = \det(A)\det(B).$$

Proof: Let $c_{ij}$ be the $ij^{th}$ entry of $AB$. Then by Proposition 3.3.6,

$$\det(AB) = \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\,c_{1k_1}\cdots c_{nk_n}$$

$$= \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\left(\sum_{r_1} a_{1r_1}b_{r_1k_1}\right)\cdots\left(\sum_{r_n} a_{nr_n}b_{r_nk_n}\right)$$

$$= \sum_{(r_1,\cdots,r_n)}\sum_{(k_1,\cdots,k_n)} \operatorname{sgn}(k_1,\cdots,k_n)\,b_{r_1k_1}\cdots b_{r_nk_n}\left(a_{1r_1}\cdots a_{nr_n}\right)$$

$$= \sum_{(r_1,\cdots,r_n)} \operatorname{sgn}(r_1,\cdots,r_n)\,a_{1r_1}\cdots a_{nr_n}\det(B) = \det(A)\det(B). \; ■$$

The Binet Cauchy formula is a generalization of the theorem which says the determinant
of a product is the product of the determinants. The situation is illustrated in the following
picture where $A$, $B$ are matrices.

[Picture: the matrices B and A]



Theorem 3.3.14 Let $A$ be an $n \times m$ matrix with $n \geq m$ and let $B$ be an $m \times n$ matrix. Also
let $A_i$,

$$i = 1,\cdots,C(n,m)$$

be the $m \times m$ submatrices of $A$ which are obtained by deleting $n - m$ rows and let $B_i$ be the
$m \times m$ submatrices of $B$ which are obtained by deleting the corresponding $n - m$ columns. Then

$$\det(BA) = \sum_{k=1}^{C(n,m)} \det(B_k)\det(A_k).$$




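Proof: This follows from a computation. By Corollary 3.3.8 on Page 113, $\det(BA) =$

$$\frac{1}{m!}\sum_{(i_1\cdots i_m)}\sum_{(j_1\cdots j_m)} \operatorname{sgn}(i_1\cdots i_m)\operatorname{sgn}(j_1\cdots j_m)\,(BA)_{i_1j_1}(BA)_{i_2j_2}\cdots(BA)_{i_mj_m}$$

$$= \frac{1}{m!}\sum_{(i_1\cdots i_m)}\sum_{(j_1\cdots j_m)} \operatorname{sgn}(i_1\cdots i_m)\operatorname{sgn}(j_1\cdots j_m)\cdot
\sum_{r_1=1}^{n} B_{i_1r_1}A_{r_1j_1}\sum_{r_2=1}^{n} B_{i_2r_2}A_{r_2j_2}\cdots\sum_{r_m=1}^{n} B_{i_mr_m}A_{r_mj_m}.$$

Now denote by $I_k$ one of the subsets of $\{1,\cdots,n\}$ having $m$ elements. Thus there are $C(n,m)$ of these.
Terms in which the $r_i$ are not distinct contribute nothing, because the sum over $(j_1\cdots j_m)$ is then the
determinant of a matrix with two equal rows. Hence the above equals

$$\sum_{k=1}^{C(n,m)}\sum_{\{r_1,\cdots,r_m\}=I_k}\frac{1}{m!}\sum_{(i_1\cdots i_m)}\sum_{(j_1\cdots j_m)} \operatorname{sgn}(i_1\cdots i_m)\operatorname{sgn}(j_1\cdots j_m)\cdot
B_{i_1r_1}A_{r_1j_1}B_{i_2r_2}A_{r_2j_2}\cdots B_{i_mr_m}A_{r_mj_m}$$

$$= \sum_{k=1}^{C(n,m)}\sum_{\{r_1,\cdots,r_m\}=I_k}\frac{1}{m!}\sum_{(i_1\cdots i_m)} \operatorname{sgn}(i_1\cdots i_m)\,B_{i_1r_1}\cdots B_{i_mr_m}\cdot
\sum_{(j_1\cdots j_m)} \operatorname{sgn}(j_1\cdots j_m)\,A_{r_1j_1}\cdots A_{r_mj_m}$$

$$= \sum_{k=1}^{C(n,m)}\sum_{\{r_1,\cdots,r_m\}=I_k}\frac{1}{m!}\operatorname{sgn}(r_1\cdots r_m)^2\det(B_k)\det(A_k)$$

$$= \sum_{k=1}^{C(n,m)} \det(B_k)\det(A_k)$$

since there are $m!$ ways of arranging the indices $\{r_1,\cdots,r_m\}$. ■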
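A quick numerical check of the Binet Cauchy formula may help in keeping track of which rows and columns get deleted. The following is a small Python sketch under the assumption of integer entries; the helper names `det`, `matmul` and `binet_cauchy` are invented for this illustration. Each $m$-element subset of $\{1,\cdots,n\}$ selects $m$ columns of $B$ and the corresponding $m$ rows of $A$.

```python
from itertools import combinations, permutations

def det(M):
    """Determinant as a signed sum over permutations (fine for small matrices)."""
    n = len(M)
    total = 0
    for ks in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if ks[i] > ks[j])
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(ks):
            term *= M[row][col]
        total += term
    return total

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def binet_cauchy(B, A):
    """Sum of det(B_k) det(A_k) over all m-element index sets; B is m x n, A is n x m."""
    m, n = len(B), len(A)
    total = 0
    for idx in combinations(range(n), m):
        B_k = [[B[i][c] for c in idx] for i in range(m)]  # keep the columns of B in idx
        A_k = [A[r] for r in idx]                         # keep the matching rows of A
        total += det(B_k) * det(A_k)
    return total

B = [[1, 2, 0], [3, -1, 4]]        # 2 x 3
A = [[2, 1], [0, 5], [-3, 2]]      # 3 x 2
print(det(matmul(B, A)), binet_cauchy(B, A))

# When m = n there is only one index set and the formula is det(BA) = det(B) det(A).
S = [[1, 2], [3, 4]]
T = [[0, 1], [5, -2]]
print(det(matmul(S, T)), det(S) * det(T))
```

For the matrices above both sides come out to 78, the three index sets contributing $-70$, $28$ and $120$.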

3.3.5 Expansion Using Cofactors 
Lemma 3.3.15 Suppose a matrix is of the form

$$M = \begin{pmatrix} A & * \\ 0 & a \end{pmatrix} \qquad (3.13)$$

or

$$M = \begin{pmatrix} A & 0 \\ * & a \end{pmatrix} \qquad (3.14)$$

where $a$ is a number and $A$ is an $(n-1) \times (n-1)$ matrix and $*$ denotes either a column
or a row having length $n - 1$ and the $0$ denotes either a column or a row of length $n - 1$
consisting entirely of zeros. Then $\det(M) = a\det(A)$.

Proof: Denote $M$ by $(m_{ij})$. Thus in the first case, $m_{nn} = a$ and $m_{ni} = 0$ if $i \neq n$ while
in the second case, $m_{nn} = a$ and $m_{in} = 0$ if $i \neq n$. From the definition of the determinant,

$$\det(M) = \sum_{(k_1,\cdots,k_n)} \operatorname{sgn}_n(k_1,\cdots,k_n)\,m_{1k_1}\cdots m_{nk_n}.$$

Letting $\theta$ denote the position of $n$ in the ordered list $(k_1,\cdots,k_n)$, then using the earlier
conventions used to prove Lemma 3.3.1, $\det(M)$ equals

$$\sum_{(k_1,\cdots,k_n)} (-1)^{n-\theta}\operatorname{sgn}_{n-1}(k_1,\cdots,k_{\theta-1},k_{\theta+1},\cdots,k_n)\,m_{1k_1}\cdots m_{nk_n}.$$

Now suppose 3.14. Then if $\theta \neq n$, the term contains the factor $m_{\theta k_\theta} = m_{\theta n}$, which equals
zero. Therefore, the only terms which survive are those for which $\theta = n$ or in other words,
those for which $k_n = n$. Therefore, the above expression reduces to

$$a\sum_{(k_1,\cdots,k_{n-1})} \operatorname{sgn}_{n-1}(k_1,\cdots,k_{n-1})\,m_{1k_1}\cdots m_{(n-1)k_{n-1}} = a\det(A).$$

To get the assertion in the situation of 3.13 use Corollary 3.3.8 and 3.14 to write

$$\det(M) = \det(M^T) = \det\begin{pmatrix} A^T & 0 \\ * & a \end{pmatrix} = a\det(A^T) = a\det(A). \; ■$$

In terms of the theory of determinants, arguably the most important idea is that of 
Laplace expansion along a row or a column. This will follow from the above definition of a 
determinant. 

Definition 3.3.16 Let $A = (a_{ij})$ be an $n \times n$ matrix. Then a new matrix called the cofactor
matrix $\operatorname{cof}(A)$ is defined by $\operatorname{cof}(A) = (c_{ij})$ where to obtain $c_{ij}$ delete the $i^{th}$ row and the
$j^{th}$ column of $A$, take the determinant of the $(n-1) \times (n-1)$ matrix which results (this
is called the $ij^{th}$ minor of $A$), and then multiply this number by $(-1)^{i+j}$. To make the
formulas easier to remember, $\operatorname{cof}(A)_{ij}$ will denote the $ij^{th}$ entry of the cofactor matrix.

The following is the main result. Earlier this was given as a definition and the outrageous
totally unjustified assertion was made that the same number would be obtained by expanding
the determinant along any row or column. The following theorem proves this assertion.

Theorem 3.3.17 Let $A$ be an $n \times n$ matrix. Then

$$\det(A) = \sum_{j=1}^{n} a_{ij}\operatorname{cof}(A)_{ij} = \sum_{i=1}^{n} a_{ij}\operatorname{cof}(A)_{ij}.$$

The first formula consists of expanding the determinant along the $i^{th}$ row and the second
expands the determinant along the $j^{th}$ column.

Proof: Let $(a_{i1},\cdots,a_{in})$ be the $i^{th}$ row of $A$. Let $B_j$ be the matrix obtained from $A$ by
leaving every row the same except the $i^{th}$ row which in $B_j$ equals $(0,\cdots,0,a_{ij},0,\cdots,0)$.
Then by Corollary 3.3.9,

$$\det(A) = \sum_{j=1}^{n} \det(B_j).$$

Denote by $A^{ij}$ the $(n-1) \times (n-1)$ matrix obtained by deleting the $i^{th}$ row and the $j^{th}$
column of $A$. Thus $\operatorname{cof}(A)_{ij} = (-1)^{i+j}\det(A^{ij})$. At this point, recall that from Proposition
3.3.6, when two rows or two columns in a matrix $M$ are switched, this results in multiplying
the determinant of the old matrix by $-1$ to get the determinant of the new matrix. Therefore,
by Lemma 3.3.15,

$$\det(B_j) = (-1)^{n-j}(-1)^{n-i}\det\begin{pmatrix} A^{ij} & * \\ 0 & a_{ij} \end{pmatrix}
= (-1)^{i+j}\det\begin{pmatrix} A^{ij} & * \\ 0 & a_{ij} \end{pmatrix} = a_{ij}\operatorname{cof}(A)_{ij}.$$

Therefore,

$$\det(A) = \sum_{j=1}^{n} a_{ij}\operatorname{cof}(A)_{ij}$$

which is the formula for expanding $\det(A)$ along the $i^{th}$ row. Also,

$$\det(A) = \det(A^T) = \sum_{j=1}^{n} a^T_{ij}\operatorname{cof}(A^T)_{ij} = \sum_{j=1}^{n} a_{ji}\operatorname{cof}(A)_{ji}$$

which is the formula for expanding $\det(A)$ along the $i^{th}$ column. ■
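To see the theorem in action, one can expand a small determinant along every row and every column and compare. The Python sketch below does this; the function names are chosen just for this illustration and indices start at 0 rather than 1. Note that `det` itself is defined here by expansion along the first row, so the check for that row is trivially true, and the other rows and the columns are the real test.

```python
def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Recursive expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum(A[0][j] * cofactor(A, 0, j) for j in range(len(A)))

def cofactor(A, i, j):
    """(-1)^(i+j) times the determinant of the minor; the parity is the same for 0-based indices."""
    return (-1) ** (i + j) * det(minor(A, i, j))

def expand_along_row(A, i):
    return sum(A[i][j] * cofactor(A, i, j) for j in range(len(A)))

def expand_along_column(A, j):
    return sum(A[i][j] * cofactor(A, i, j) for i in range(len(A)))

A = [[2, 0, 1], [1, 3, -1], [4, 1, 5]]
row_values = [expand_along_row(A, i) for i in range(3)]
column_values = [expand_along_column(A, j) for j in range(3)]
print(row_values, column_values)
```

All six expansions give the same number, 21 for this matrix.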

3.3.6 A Formula For The Inverse 

Note that this gives an easy way to write a formula for the inverse of an n x n matrix. Recall 
the definition of the inverse of a matrix in Definition 2.1.22 on Page 61. 



Theorem 3.3.18 $A^{-1}$ exists if and only if $\det(A) \neq 0$. If $\det(A) \neq 0$, then $A^{-1} = (a^{-1}_{ij})$ where

$$a^{-1}_{ij} = \det(A)^{-1}\operatorname{cof}(A)_{ji}$$

for $\operatorname{cof}(A)_{ij}$ the $ij^{th}$ cofactor of $A$.

Proof: By Theorem 3.3.17 and letting $(a_{ir}) = A$, if $\det(A) \neq 0$,

$$\sum_{i=1}^{n} a_{ir}\operatorname{cof}(A)_{ir}\det(A)^{-1} = \det(A)\det(A)^{-1} = 1.$$

Now in the matrix $A$, replace the $k^{th}$ column with the $r^{th}$ column and then expand along
the $k^{th}$ column. This yields for $k \neq r$,

$$\sum_{i=1}^{n} a_{ir}\operatorname{cof}(A)_{ik}\det(A)^{-1} = 0$$

because there are two equal columns by Corollary 3.3.9. Summarizing,

$$\sum_{i=1}^{n} a_{ir}\operatorname{cof}(A)_{ik}\det(A)^{-1} = \delta_{rk}.$$

Using the other formula in Theorem 3.3.17, and similar reasoning,

$$\sum_{j=1}^{n} a_{rj}\operatorname{cof}(A)_{kj}\det(A)^{-1} = \delta_{rk}.$$

This proves that if $\det(A) \neq 0$, then $A^{-1}$ exists with $A^{-1} = (a^{-1}_{ij})$, where

$$a^{-1}_{ij} = \operatorname{cof}(A)_{ji}\det(A)^{-1}.$$

Now suppose $A^{-1}$ exists. Then by Theorem 3.3.13,

$$1 = \det(I) = \det(AA^{-1}) = \det(A)\det(A^{-1})$$

so $\det(A) \neq 0$. ■

The next corollary points out that if an $n \times n$ matrix $A$ has a right or a left inverse, then
it has an inverse.

Corollary 3.3.19 Let $A$ be an $n \times n$ matrix and suppose there exists an $n \times n$ matrix $B$
such that $BA = I$. Then $A^{-1}$ exists and $A^{-1} = B$. Also, if there exists $C$ an $n \times n$ matrix
such that $AC = I$, then $A^{-1}$ exists and $A^{-1} = C$.

Proof: Since $BA = I$, Theorem 3.3.13 implies

$$\det B \det A = 1$$

and so $\det A \neq 0$. Therefore from Theorem 3.3.18, $A^{-1}$ exists. Therefore,

$$A^{-1} = (BA)A^{-1} = B(AA^{-1}) = BI = B.$$




The case where $AC = I$ is handled similarly. ■

The conclusion of this corollary is that left inverses, right inverses and inverses are all
the same in the context of $n \times n$ matrices.

Theorem 3.3.18 says that to find the inverse, take the transpose of the cofactor matrix
and divide by the determinant. The transpose of the cofactor matrix is called the adjugate
or sometimes the classical adjoint of the matrix $A$. It is an abomination to call it the adjoint
although you do sometimes see it referred to in this way. In words, $A^{-1}$ is equal to one over
the determinant of $A$ times the adjugate matrix of $A$.

In case you are solving a system of equations, $Ax = y$ for $x$, it follows that if $A^{-1}$ exists,

$$x = (A^{-1}A)x = A^{-1}(Ax) = A^{-1}y$$

thus solving the system. Now in the case that $A^{-1}$ exists, there is a formula for $A^{-1}$ given
above. Using this formula,

$$x_i = \sum_{j=1}^{n} a^{-1}_{ij}y_j = \sum_{j=1}^{n} \frac{1}{\det(A)}\operatorname{cof}(A)_{ji}\,y_j.$$

By the formula for the expansion of a determinant along a column,

$$x_i = \frac{1}{\det(A)}\det\begin{pmatrix} * & \cdots & y_1 & \cdots & * \\ \vdots & & \vdots & & \vdots \\ * & \cdots & y_n & \cdots & * \end{pmatrix},$$

where here the $i^{th}$ column of $A$ is replaced with the column vector $(y_1,\cdots,y_n)^T$, and the
determinant of this modified matrix is taken and divided by $\det(A)$. This formula is known
as Cramer's rule.
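Both formulas are easy to try on a small example. What follows is a Python sketch using exact rational arithmetic from the standard `fractions` module; the function names `inverse` and `cramer` are made up for this illustration, indices are 0-based, and the matrix is assumed to be at least $2 \times 2$ with integer entries. The inverse is built entry by entry as $\operatorname{cof}(A)_{ji}/\det(A)$, and Cramer's rule replaces the $i^{th}$ column by $y$.

```python
from fractions import Fraction

def minor(A, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def cofactor(A, i, j):
    return (-1) ** (i + j) * det(minor(A, i, j))

def inverse(A):
    """Entry (i, j) of the inverse is cof(A)_{ji} / det(A): transposed cofactors over det."""
    d = det(A)
    if d == 0:
        raise ValueError("det(A) = 0, so no inverse exists")
    n = len(A)
    return [[Fraction(cofactor(A, j, i), d) for j in range(n)] for i in range(n)]

def cramer(A, y):
    """x_i = det(A with column i replaced by y) / det(A)."""
    d = det(A)
    if d == 0:
        raise ValueError("det(A) = 0, Cramer's rule does not apply")
    x = []
    for i in range(len(A)):
        A_i = [row[:i] + [y[r]] + row[i + 1:] for r, row in enumerate(A)]
        x.append(Fraction(det(A_i), d))
    return x

A = [[2, 0, 1], [1, 3, -1], [4, 1, 5]]
y = [1, 2, 3]
A_inv = inverse(A)
x = cramer(A, y)
print(A_inv)
print(x)
```

Multiplying back, `A_inv` times `A` is the identity and `A` times `x` reproduces `y`. As the text on numerical methods warns, this is a formula for understanding rather than for large computations, since the cost of all those cofactors grows very quickly with $n$.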

Definition 3.3.20 A matrix $M$ is upper triangular if $M_{ij} = 0$ whenever $i > j$. Thus such
a matrix equals zero below the main diagonal, the entries of the form $M_{ii}$, as shown.

$$\begin{pmatrix} * & * & \cdots & * \\ 0 & * & \ddots & \vdots \\ \vdots & \ddots & \ddots & * \\ 0 & \cdots & 0 & * \end{pmatrix}$$

A lower triangular matrix is defined similarly as a matrix for which all entries above the
main diagonal are equal to zero.

With this definition, here is a simple corollary of Theorem 3.3.17. 

Corollary 3.3.21 Let M be an upper (lower) triangular matrix. Then det (M) is obtained 
by taking the product of the entries on the main diagonal. 

3.3.7 Rank Of A Matrix 

Definition 3.3.22 A submatrix of a matrix A is the rectangular array of numbers obtained 
by deleting some rows and columns of A. Let A be an m x n matrix. The determinant 
rank of the matrix equals r where r is the largest number such that some r x r submatrix 
of A has a non zero determinant. The row rank is defined to be the dimension of the span 
of the rows. The column rank is defined to be the dimension of the span of the columns. 

Theorem 3.3.23 If $A$, an $m \times n$ matrix, has determinant rank $r$, then there exist $r$ rows of
the matrix such that every other row is a linear combination of these $r$ rows.

Proof: Suppose the determinant rank of $A = (a_{ij})$ equals $r$. Thus some $r \times r$ submatrix
has non zero determinant and there is no larger square submatrix which has non zero
determinant. Suppose such a submatrix is determined by the $r$ columns whose indices are

$$j_1 < \cdots < j_r$$

and the $r$ rows whose indices are

$$i_1 < \cdots < i_r.$$

I want to show that every row is a linear combination of these rows. Consider the $l^{th}$ row
and let $p$ be an index between 1 and $n$. Form the following $(r+1) \times (r+1)$ matrix

$$\begin{pmatrix} a_{i_1j_1} & \cdots & a_{i_1j_r} & a_{i_1p} \\ \vdots & & \vdots & \vdots \\ a_{i_rj_1} & \cdots & a_{i_rj_r} & a_{i_rp} \\ a_{lj_1} & \cdots & a_{lj_r} & a_{lp} \end{pmatrix}.$$

Of course you can assume $l \notin \{i_1,\cdots,i_r\}$ because there is nothing to prove if the $l^{th}$
row is one of the chosen ones. The above matrix has determinant 0. This is because if
$p \notin \{j_1,\cdots,j_r\}$ then the above would be a submatrix of $A$ which is too large to have non
zero determinant. On the other hand, if $p \in \{j_1,\cdots,j_r\}$ then the above matrix has two
columns which are equal so its determinant is still 0.

Expand the determinant of the above matrix along the last column. Let $C_k$ denote the
cofactor associated with the entry $a_{i_kp}$. This is not dependent on the choice of $p$. Remember,
you delete the column and the row the entry is in and take the determinant of what is left
and multiply by $-1$ raised to an appropriate power. Let $C$ denote the cofactor associated
with $a_{lp}$. This is given to be nonzero, it being the determinant of the matrix

$$\begin{pmatrix} a_{i_1j_1} & \cdots & a_{i_1j_r} \\ \vdots & & \vdots \\ a_{i_rj_1} & \cdots & a_{i_rj_r} \end{pmatrix}.$$

Thus

$$0 = a_{lp}C + \sum_{k=1}^{r} C_k a_{i_kp}$$

which implies

$$a_{lp} = \sum_{k=1}^{r} \frac{-C_k}{C}\,a_{i_kp} = \sum_{k=1}^{r} m_k a_{i_kp}.$$

Since this is true for every $p$ and since $m_k$ does not depend on $p$, this has shown the $l^{th}$ row
is a linear combination of the $i_1, i_2, \cdots, i_r$ rows. ■

Corollary 3.3.24 The determinant rank equals the row rank. 

Proof: From Theorem 3.3.23, every row is in the span of r rows where r is the deter- 
minant rank. Therefore, the row rank (dimension of the span of the rows) is no larger than 
the determinant rank. Could the row rank be smaller than the determinant rank? If so, 
it follows from Theorem 3.3.23 that there exist p rows for p < r = determinant rank, such 
that the span of these p rows equals the row space. But then you could consider the r x r 
sub matrix which determines the determinant rank and it would follow that each of these 
rows would be in the span of the restrictions of the p rows just mentioned. By Theorem 
2.4.4, the exchange theorem, the rows of this sub matrix would not be linearly independent 
and so some row is a linear combination of the others. By Corollary 3.3.11 the determinant 
would be 0, a contradiction. ■ 

Corollary 3.3.25 If A has determinant rank r, then there exist r columns of the matrix 
such that every other column is a linear combination of these r columns. Also the column 
rank equals the determinant rank. 

Proof: This follows from the above by considering A T . The rows of A T are the columns 
of A and the determinant rank of A T and A are the same. Therefore, from Corollary 3.3.24, 
column rank of A = row rank of A T = determinant rank of A T = determinant rank of A. 
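The three notions of rank can be compared by brute force on a small matrix. In the Python sketch below, which uses names invented for this illustration, `determinant_rank` searches for the largest square submatrix with non zero determinant, while `row_rank` counts pivots after Gaussian elimination in exact rational arithmetic. Elimination is a swapped-in technique here, since row operations are only taken up in the next chapter; it is used only as an independent way to get the dimension of the span of the rows.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )

def determinant_rank(A):
    """Largest r such that some r x r submatrix has non zero determinant."""
    m, n = len(A), len(A[0])
    for r in range(min(m, n), 0, -1):
        for rows in combinations(range(m), r):
            for cols in combinations(range(n), r):
                if det([[A[i][j] for j in cols] for i in rows]) != 0:
                    return r
    return 0

def row_rank(A):
    """Number of pivots found by Gaussian elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in A]
    rank = 0
    for col in range(len(M[0])):
        pivot = next((r for r in range(rank, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, len(M)):
            factor = M[r][col] / M[rank][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]   # second row is twice the first
print(determinant_rank(A), row_rank(A), row_rank(transpose(A)))
```

For this matrix all three numbers are 2, the column rank being computed as the row rank of the transpose.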

The following theorem is of fundamental importance and ties together many of the ideas 
presented above. 

Theorem 3.3.26 Let $A$ be an $n \times n$ matrix. Then the following are equivalent.

1. $\det(A) = 0$.

2. $A$, $A^T$ are not one to one.

3. $A$ is not onto.

Proof: Suppose $\det(A) = 0$. Then the determinant rank of $A = r < n$. Therefore,
there exist $r$ columns such that every other column is a linear combination of these columns
by Corollary 3.3.25. In particular, it follows that for some $m$, the $m^{th}$ column is a linear
combination of all the others. Thus letting $A = \begin{pmatrix} a_1 & \cdots & a_m & \cdots & a_n \end{pmatrix}$ where the
columns are denoted by $a_i$, there exist scalars $\alpha_i$ such that

$$a_m = \sum_{k \neq m} \alpha_k a_k.$$

Now consider the column vector $x = \begin{pmatrix} \alpha_1 & \cdots & -1 & \cdots & \alpha_n \end{pmatrix}^T$, with the $-1$ in the $m^{th}$ position. Then

$$Ax = -a_m + \sum_{k \neq m} \alpha_k a_k = 0.$$

Since also $A0 = 0$, it follows $A$ is not one to one. Similarly, $A^T$ is not one to one by the
same argument applied to $A^T$. This verifies that 1.) implies 2.).

Now suppose 2.). Then since $A^T$ is not one to one, it follows there exists $x \neq 0$ such that

$$A^Tx = 0.$$

Taking the transpose of both sides yields

$$x^TA = 0^T$$

where the $0^T$ is a $1 \times n$ matrix or row vector. Now if $Ay = x$, then

$$|x|^2 = x^T(Ay) = (x^TA)y = 0y = 0$$

contrary to $x \neq 0$. Consequently there can be no $y$ such that $Ay = x$ and so $A$ is not onto.
This shows that 2.) implies 3.).

Finally, suppose 3.). If 1.) does not hold, then $\det(A) \neq 0$ but then from Theorem 3.3.18
$A^{-1}$ exists and so for every $y \in \mathbb{F}^n$ there exists a unique $x \in \mathbb{F}^n$ such that $Ax = y$. In fact
$x = A^{-1}y$. Thus $A$ would be onto contrary to 3.). This shows 3.) implies 1.). ■

Corollary 3.3.27 Let $A$ be an $n \times n$ matrix. Then the following are equivalent.

1. $\det(A) \neq 0$.

2. $A$ and $A^T$ are one to one.

3. $A$ is onto.

Proof: This follows immediately from the above theorem. ■

3.3.8 Summary Of Determinants 

In all the following A, B are n x n matrices 

1. det (A) is a number. 

2. $\det(A)$ is linear in each row and in each column.

3. If you switch two rows or two columns, the determinant of the resulting matrix is $-1$
times the determinant of the unswitched matrix. (This and the previous one say

$$\begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix} \to \det\begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}$$

is an alternating multilinear function or alternating tensor.)

4. $\det(e_1,\cdots,e_n) = 1$.

5. $\det(AB) = \det(A)\det(B)$

6. $\det(A)$ can be expanded along any row or any column and the same result is obtained.

7. $\det(A) = \det(A^T)$

8. $A^{-1}$ exists if and only if $\det(A) \neq 0$ and in this case

$$\left(A^{-1}\right)_{ij} = \frac{1}{\det(A)}\operatorname{cof}(A)_{ji} \qquad (3.16)$$

9. Determinant rank, row rank and column rank are all the same number for any $m \times n$
matrix.

3.4 The Cayley Hamilton Theorem 

Definition 3.4.1 Let $A$ be an $n \times n$ matrix. The characteristic polynomial is defined as

$$p_A(t) = \det(tI - A)$$

and the solutions to $p_A(t) = 0$ are called eigenvalues. For $A$ a matrix and $p(t) = t^n +
a_{n-1}t^{n-1} + \cdots + a_1t + a_0$, denote by $p(A)$ the matrix defined by

$$p(A) = A^n + a_{n-1}A^{n-1} + \cdots + a_1A + a_0I.$$

The explanation for the last term is that $A^0$ is interpreted as $I$, the identity matrix.

The Cayley Hamilton theorem states that every matrix satisfies its characteristic equation,
that equation defined by $p_A(t) = 0$. It is one of the most important theorems in linear
algebra. The following lemma will help with its proof.

Lemma 3.4.2 Suppose for all $|\lambda|$ large enough,

$$A_0 + A_1\lambda + \cdots + A_m\lambda^m = 0,$$

where the $A_i$ are $n \times n$ matrices. Then each $A_i = 0$.

Proof: Multiply by $\lambda^{-m}$ to obtain

$$A_0\lambda^{-m} + A_1\lambda^{-m+1} + \cdots + A_{m-1}\lambda^{-1} + A_m = 0.$$

Now let $|\lambda| \to \infty$ to obtain $A_m = 0$. With this, multiply by $\lambda$ to obtain

$$A_0\lambda^{-m+1} + A_1\lambda^{-m+2} + \cdots + A_{m-1} = 0.$$

Now let $|\lambda| \to \infty$ to obtain $A_{m-1} = 0$. Continue multiplying by $\lambda$ and letting $|\lambda| \to \infty$ to
obtain that all the $A_i = 0$. ■

With the lemma, here is a simple corollary.

Corollary 3.4.3 Let $A_i$ and $B_i$ be $n \times n$ matrices and suppose

$$A_0 + A_1\lambda + \cdots + A_m\lambda^m = B_0 + B_1\lambda + \cdots + B_m\lambda^m$$

for all $|\lambda|$ large enough. Then $A_i = B_i$ for all $i$. Consequently if $\lambda$ is replaced by any $n \times n$
matrix, the two sides will be equal. That is, for $C$ any $n \times n$ matrix,

$$A_0 + A_1C + \cdots + A_mC^m = B_0 + B_1C + \cdots + B_mC^m.$$

Proof: Subtract and use the result of the lemma. ■

With this preparation, here is a relatively easy proof of the Cayley Hamilton theorem.

Theorem 3.4.4 Let $A$ be an $n \times n$ matrix and let $p(\lambda) = \det(\lambda I - A)$ be the characteristic
polynomial. Then $p(A) = 0$.

Proof: Let $C(\lambda)$ equal the transpose of the cofactor matrix of $(\lambda I - A)$ for $|\lambda|$ large.
(If $|\lambda|$ is large enough, then $\lambda$ cannot be in the finite list of eigenvalues of $A$ and so for such
$\lambda$, $(\lambda I - A)^{-1}$ exists.) Therefore, by Theorem 3.3.18

$$C(\lambda) = p(\lambda)(\lambda I - A)^{-1}.$$

Note that each entry in $C(\lambda)$ is a polynomial in $\lambda$ having degree no more than $n - 1$.
Therefore, collecting the terms,

$$C(\lambda) = C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1}$$

for $C_j$ some $n \times n$ matrix. It follows that for all $|\lambda|$ large enough,

$$(\lambda I - A)\left(C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1}\right) = p(\lambda)I$$

and so Corollary 3.4.3 may be used. It follows the matrix coefficients corresponding to equal
powers of $\lambda$ are equal on both sides of this equation. Therefore, if $\lambda$ is replaced with $A$, the
two sides will be equal. Thus

$$0 = (A - A)\left(C_0 + C_1A + \cdots + C_{n-1}A^{n-1}\right) = p(A)I = p(A). \; ■$$
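It is instructive to watch $p(A)$ come out to be the zero matrix for a specific $A$. The Python sketch below is only an illustration and does not follow the proof above. To get the coefficients of $\det(tI - A)$ it uses the Faddeev LeVerrier recursion, a standard algorithm which is not discussed in this book and is swapped in here for convenience, and then it evaluates $p(A)$ by Horner's rule. The function names are invented and exact rational arithmetic is assumed.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def char_poly(A):
    """Coefficients c[0], ..., c[n] with det(tI - A) = c[0] + c[1] t + ... + c[n] t^n.

    Faddeev LeVerrier recursion: M_k = A M_{k-1} + c[n-k+1] I and
    c[n-k] = -trace(A M_k) / k, starting from M_0 = 0 and c[n] = 1.
    """
    n = len(A)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)] for i in range(n)]
        AMk = matmul(A, M)
        c[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return c

def poly_at_matrix(c, A):
    """Evaluate c[0] I + c[1] A + ... + c[n] A^n by Horner's rule."""
    n = len(A)
    P = [[c[-1] if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    for coeff in reversed(c[:-1]):
        P = matmul(P, A)
        P = [[P[i][j] + (coeff if i == j else 0) for j in range(n)] for i in range(n)]
    return P

A2 = [[1, 2], [3, 4]]
A3 = [[2, 0, 1], [1, 3, -1], [4, 1, 5]]
print(char_poly(A2), poly_at_matrix(char_poly(A2), A2))
print(char_poly(A3), poly_at_matrix(char_poly(A3), A3))
```

For the $2 \times 2$ example the polynomial is $t^2 - 5t - 2$, and for the $3 \times 3$ example it is $t^3 - 10t^2 + 28t - 21$. In both cases substituting the matrix gives the zero matrix.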

3.5 Block Multiplication Of Matrices 

Consider the following problem

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} E & F \\ G & H \end{pmatrix}.$$

You know how to do this. You get

$$\begin{pmatrix} AE + BG & AF + BH \\ CE + DG & CF + DH \end{pmatrix}.$$

Now what if instead of numbers, the entries $A, B, C, D, E, F, G, H$ are matrices of a size such
that the multiplications and additions needed in the above formula all make sense. Would
the formula be true in this case? I will show below that this is true.
Suppose $A$ is a matrix of the form

$$A = \begin{pmatrix} A_{11} & \cdots & A_{1m} \\ \vdots & \ddots & \vdots \\ A_{r1} & \cdots & A_{rm} \end{pmatrix} \qquad (3.17)$$

where $A_{ij}$ is a $s_i \times p_j$ matrix where $s_i$ is constant for $j = 1,\cdots,m$ for each $i = 1,\cdots,r$.
Such a matrix is called a block matrix, also a partitioned matrix. How do you get the
block $A_{ij}$? Here is how for $A$ an $m \times n$ matrix:

$$\begin{pmatrix} 0 & I_{s_i \times s_i} & 0 \end{pmatrix} A \begin{pmatrix} 0 \\ I_{p_j \times p_j} \\ 0 \end{pmatrix}. \qquad (3.18)$$

In the block column matrix on the right, you need to have $c_j - 1$ rows of zeros above the
small $p_j \times p_j$ identity matrix where the columns of $A$ involved in $A_{ij}$ are $c_j,\cdots,c_j + p_j - 1$
and in the block row matrix on the left, you need to have $r_i - 1$ columns of zeros to the left
of the $s_i \times s_i$ identity matrix where the rows of $A$ involved in $A_{ij}$ are $r_i,\cdots,r_i + s_i - 1$. An
important observation to make is that the matrix on the right specifies columns to use in
the block and the one on the left specifies the rows used. Thus the block $A_{ij}$ in this case
is a matrix of size $s_i \times p_j$. There is no overlap between the blocks of $A$. Thus the
$n \times n$ identity matrix corresponding to multiplication on the right of $A$ is of the form

$$\begin{pmatrix} I_{p_1 \times p_1} & & 0 \\ & \ddots & \\ 0 & & I_{p_m \times p_m} \end{pmatrix}$$

where these little identity matrices don't overlap. A similar conclusion follows from consideration
of the matrices $I_{s_i \times s_i}$. Note that in 3.18 the matrix on the right is a block column
matrix for the above block diagonal matrix and the matrix on the left in 3.18 is a block row
matrix taken from a similar block diagonal matrix consisting of the $I_{s_i \times s_i}$.

Next consider the question of multiplication of two block matrices. Let $B$ be a block
matrix of the form

$$\begin{pmatrix} B_{11} & \cdots & B_{1p} \\ \vdots & \ddots & \vdots \\ B_{r1} & \cdots & B_{rp} \end{pmatrix} \qquad (3.19)$$

and $A$ is a block matrix of the form

$$\begin{pmatrix} A_{11} & \cdots & A_{1m} \\ \vdots & \ddots & \vdots \\ A_{p1} & \cdots & A_{pm} \end{pmatrix} \qquad (3.20)$$

and that for all $i, j$, it makes sense to multiply $B_{is}A_{sj}$ for all $s \in \{1,\cdots,p\}$. (That is, the
two matrices $B_{is}$ and $A_{sj}$ are conformable.) Suppose also that for fixed $ij$, it follows $B_{is}A_{sj}$ is the
same size for each $s$ so that it makes sense to write $\sum_s B_{is}A_{sj}$.

The following theorem says essentially that when you take the product of two matrices,
you can do it two ways. One way is to simply multiply them forming $BA$. The other way
is to partition both matrices, formally multiply the blocks to get another block matrix and
this one will be $BA$ partitioned. Before presenting this theorem, here is a simple lemma
which is really a special case of the theorem.

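Lemma 3.5.1 Consider the following product.

$$\begin{pmatrix} 0 \\ I \\ 0 \end{pmatrix}\begin{pmatrix} 0 & I & 0 \end{pmatrix}$$

where the first is $n \times r$ and the second is $r \times n$. The small identity matrix $I$ is an $r \times r$ matrix
and there are $l$ zero rows above $I$ and $l$ zero columns to the left of $I$ in the right matrix.
Then the product of these matrices is a block matrix of the form

$$\begin{pmatrix} 0 & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

Proof: From the definition of the way you multiply matrices, the product is

$$\begin{pmatrix} \begin{pmatrix} 0 \\ I \\ 0 \end{pmatrix}0 & \cdots & \begin{pmatrix} 0 \\ I \\ 0 \end{pmatrix}0 & \begin{pmatrix} 0 \\ I \\ 0 \end{pmatrix}e_1 & \cdots & \begin{pmatrix} 0 \\ I \\ 0 \end{pmatrix}e_r & \begin{pmatrix} 0 \\ I \\ 0 \end{pmatrix}0 & \cdots & \begin{pmatrix} 0 \\ I \\ 0 \end{pmatrix}0 \end{pmatrix}$$

which yields the claimed result. In the formula $e_j$ refers to the column vector of length $r$
which has a 1 in the $j^{th}$ position. ■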

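Theorem 3.5.2 Let $B$ be a $q \times p$ block matrix as in 3.19 and let $A$ be a $p \times n$ block matrix
as in 3.20 such that $B_{is}$ is conformable with $A_{sj}$ and each product, $B_{is}A_{sj}$ for $s = 1,\cdots,p$
is of the same size so they can be added. Then $BA$ can be obtained as a block matrix such
that the $ij^{th}$ block is of the form

$$\sum_s B_{is}A_{sj}. \qquad (3.21)$$

Proof: From 3.18

$$B_{is}A_{sj} = \begin{pmatrix} 0 & I_{r_i \times r_i} & 0 \end{pmatrix} B \begin{pmatrix} 0 \\ I_{p_s \times p_s} \\ 0 \end{pmatrix}\begin{pmatrix} 0 & I_{p_s \times p_s} & 0 \end{pmatrix} A \begin{pmatrix} 0 \\ I_{q_j \times q_j} \\ 0 \end{pmatrix}$$

where here it is assumed $B_{is}$ is $r_i \times p_s$ and $A_{sj}$ is $p_s \times q_j$. The product involves the $s^{th}$
block in the $i^{th}$ row of blocks for $B$ and the $s^{th}$ block in the $j^{th}$ column of $A$. Thus there
are the same number of rows above the $I_{p_s \times p_s}$ as there are columns to the left of $I_{p_s \times p_s}$ in
those two inside matrices. Then from Lemma 3.5.1

$$\begin{pmatrix} 0 \\ I_{p_s \times p_s} \\ 0 \end{pmatrix}\begin{pmatrix} 0 & I_{p_s \times p_s} & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & I_{p_s \times p_s} & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

Since the blocks of small identity matrices do not overlap,

$$\sum_s \begin{pmatrix} 0 & 0 & 0 \\ 0 & I_{p_s \times p_s} & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} I_{p_1 \times p_1} & & 0 \\ & \ddots & \\ 0 & & I_{p_p \times p_p} \end{pmatrix} = I$$

and so

$$\sum_s B_{is}A_{sj} = \sum_s \begin{pmatrix} 0 & I_{r_i \times r_i} & 0 \end{pmatrix} B \begin{pmatrix} 0 \\ I_{p_s \times p_s} \\ 0 \end{pmatrix}\begin{pmatrix} 0 & I_{p_s \times p_s} & 0 \end{pmatrix} A \begin{pmatrix} 0 \\ I_{q_j \times q_j} \\ 0 \end{pmatrix}$$

$$= \begin{pmatrix} 0 & I_{r_i \times r_i} & 0 \end{pmatrix} B \sum_s \begin{pmatrix} 0 \\ I_{p_s \times p_s} \\ 0 \end{pmatrix}\begin{pmatrix} 0 & I_{p_s \times p_s} & 0 \end{pmatrix} A \begin{pmatrix} 0 \\ I_{q_j \times q_j} \\ 0 \end{pmatrix}$$

$$= \begin{pmatrix} 0 & I_{r_i \times r_i} & 0 \end{pmatrix} BIA \begin{pmatrix} 0 \\ I_{q_j \times q_j} \\ 0 \end{pmatrix} = \begin{pmatrix} 0 & I_{r_i \times r_i} & 0 \end{pmatrix} BA \begin{pmatrix} 0 \\ I_{q_j \times q_j} \\ 0 \end{pmatrix}$$

which equals the $ij^{th}$ block of $BA$. Hence the $ij^{th}$ block of $BA$ equals the formal multiplication
according to matrix multiplication, $\sum_s B_{is}A_{sj}$. ■

Multiply it by $B = \begin{pmatrix} p & q \\ r & Q \end{pmatrix}$ where $B$ is also an $n \times n$ matrix and $Q$ is $(n-1) \times (n-1)$.

You use block multiplication

$$\begin{pmatrix} a & b \\ c & P \end{pmatrix}\begin{pmatrix} p & q \\ r & Q \end{pmatrix} = \begin{pmatrix} ap + br & aq + bQ \\ pc + Pr & cq + PQ \end{pmatrix}.$$

Note that this all makes sense. For example, $b$ is $1 \times (n-1)$ and $r$ is $(n-1) \times 1$ so $br$ is a
$1 \times 1$. Similar considerations apply to the other blocks.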
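Block multiplication can be checked on a small example by assembling the product one block at a time and comparing with the ordinary product. Here is a Python sketch along these lines; the helper names and the way a partition is passed in as lists of index ranges are assumptions made for this illustration. The $(i,j)$ block of the result is computed as $\sum_s B_{is}A_{sj}$, exactly as in the theorem.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def block(M, rows, cols):
    """Submatrix of M using the given row and column indices."""
    return [[M[i][j] for j in cols] for i in rows]

def block_product(B, A, row_parts, inner_parts, col_parts):
    """Assemble BA block by block: the (i, j) block is the sum over s of B_is A_sj.

    row_parts partitions the rows of B, inner_parts partitions the columns of B
    (equivalently the rows of A), and col_parts partitions the columns of A.
    """
    result = [[0] * len(A[0]) for _ in range(len(B))]
    for R in row_parts:
        for C in col_parts:
            acc = None
            for S in inner_parts:
                prod = matmul(block(B, R, S), block(A, S, C))
                acc = prod if acc is None else matadd(acc, prod)
            for a, i in enumerate(R):
                for b, j in enumerate(C):
                    result[i][j] = acc[a][b]
    return result

# The partition 1 + (n - 1) used in the example above, with n = 3.
parts = [range(0, 1), range(1, 3)]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
B = [[2, 0, -1], [1, 1, 0], [3, -2, 4]]
blockwise = block_product(A, B, parts, parts, parts)
print(blockwise == matmul(A, B))
```

Any other partition works just as well, as long as the columns of the left factor and the rows of the right factor are split in the same way.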

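Here is an interesting and significant application of block multiplication. In this theorem,
$p_M(t)$ denotes the characteristic polynomial, $\det(tI - M)$. The zeros of this polynomial will
be shown later to be eigenvalues of the matrix $M$. First note that from block multiplication,
for the following block matrices consisting of square blocks of an appropriate size,

$$\begin{pmatrix} A & 0 \\ B & C \end{pmatrix} = \begin{pmatrix} A & 0 \\ B & I \end{pmatrix}\begin{pmatrix} I & 0 \\ 0 & C \end{pmatrix}$$

so

$$\det\begin{pmatrix} A & 0 \\ B & C \end{pmatrix} = \det\begin{pmatrix} A & 0 \\ B & I \end{pmatrix}\det\begin{pmatrix} I & 0 \\ 0 & C \end{pmatrix} = \det(A)\det(C).$$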

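Theorem 3.5.4 Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times m$ matrix for $m \leq n$. Then

$$p_{BA}(t) = t^{n-m}p_{AB}(t),$$

so the eigenvalues of $BA$ and $AB$ are the same including multiplicities except that $BA$ has
$n - m$ extra zero eigenvalues. Here $p_A(t)$ denotes the characteristic polynomial of the matrix
$A$.

Proof: Use block multiplication to write

$$\begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}\begin{pmatrix} I & A \\ 0 & I \end{pmatrix} = \begin{pmatrix} AB & ABA \\ B & BA \end{pmatrix}$$

$$\begin{pmatrix} I & A \\ 0 & I \end{pmatrix}\begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix} = \begin{pmatrix} AB & ABA \\ B & BA \end{pmatrix}.$$

Therefore,

$$\begin{pmatrix} I & A \\ 0 & I \end{pmatrix}^{-1}\begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}\begin{pmatrix} I & A \\ 0 & I \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix}.$$

Since the two matrices above are similar, it follows that $\begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix}$ and $\begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}$
have the same characteristic polynomials. See Problem 8 on Page 106. Therefore, noting
that $BA$ is an $n \times n$ matrix and $AB$ is an $m \times m$ matrix,

$$t^m\det(tI - BA) = t^n\det(tI - AB)$$

and so $\det(tI - BA) = p_{BA}(t) = t^{n-m}\det(tI - AB) = t^{n-m}p_{AB}(t)$. ■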
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3.6 Exercises 



1. Let $m < n$ and let $A$ be an $m \times n$ matrix. Show that $A$ is not one to one. Hint:
Consider the $n \times n$ matrix $A_1$ which is of the form

$$A_1 = \begin{pmatrix} A \\ 0 \end{pmatrix}$$

where the $0$ denotes an $(n - m) \times n$ matrix of zeros. Thus $\det A_1 = 0$ and so $A_1$ is
not one to one. Now observe that $A_1x$ is the vector,

$$A_1x = \begin{pmatrix} Ax \\ 0 \end{pmatrix}$$

which equals zero if and only if $Ax = 0$.

2. Let $v_1,\cdots,v_n$ be vectors in $\mathbb{F}^n$ and let $M(v_1,\cdots,v_n)$ denote the matrix whose $i^{th}$
column equals $v_i$. Define

$$d(v_1,\cdots,v_n) = \det(M(v_1,\cdots,v_n)).$$

Prove that $d$ is linear in each variable (multilinear), that

$$d(v_1,\cdots,v_i,\cdots,v_j,\cdots,v_n) = -d(v_1,\cdots,v_j,\cdots,v_i,\cdots,v_n), \qquad (3.22)$$

and

$$d(e_1,\cdots,e_n) = 1 \qquad (3.23)$$

where here $e_j$ is the vector in $\mathbb{F}^n$ which has a zero in every position except the $j^{th}$
position in which it has a one.

3. Suppose $f : \mathbb{F}^n \times \cdots \times \mathbb{F}^n \to \mathbb{F}$ satisfies 3.22 and 3.23 and is linear in each variable.
Show that $f = d$.

4. Show that if you replace a row (column) of an n x n matrix A with itself added to 
some multiple of another row (column) then the new matrix has the same determinant 
as the original one. 

5. Use the result of Problem 4 to evaluate by hand the determinant

$$\det\begin{pmatrix} 1 & 2 & 3 & 2 \\ -6 & 3 & 2 & 3 \\ 5 & 2 & 2 & 3 \\ 3 & 4 & 6 & 4 \end{pmatrix}.$$
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if it exists of the matrix 



7. Let $Ly = y^{(n)} + a_{n-1}(x)y^{(n-1)} + \cdots + a_1(x)y' + a_0(x)y$ where the $a_i$ are given
continuous functions defined on an interval $(a,b)$ and $y$ is some function which has $n$
derivatives so it makes sense to write $Ly$. Suppose $Ly_k = 0$ for $k = 1,2,\cdots,n$. The
Wronskian of these functions, $y_i$, is defined as

$$W(y_1,\cdots,y_n)(x) = \det\begin{pmatrix} y_1(x) & \cdots & y_n(x) \\ y_1'(x) & \cdots & y_n'(x) \\ \vdots & & \vdots \\ y_1^{(n-1)}(x) & \cdots & y_n^{(n-1)}(x) \end{pmatrix}.$$

Show that for $W(x) = W(y_1,\cdots,y_n)(x)$ to save space,

$$W'(x) = \det\begin{pmatrix} y_1(x) & \cdots & y_n(x) \\ y_1'(x) & \cdots & y_n'(x) \\ \vdots & & \vdots \\ y_1^{(n-2)}(x) & \cdots & y_n^{(n-2)}(x) \\ y_1^{(n)}(x) & \cdots & y_n^{(n)}(x) \end{pmatrix}.$$

Now use the differential equation, $Ly = 0$ which is satisfied by each of these functions,
$y_i$ and properties of determinants presented above to verify that $W' + a_{n-1}(x)W = 0$.
Give an explicit solution of this linear differential equation, Abel's formula, and use
your answer to verify that the Wronskian of these solutions to the equation, $Ly = 0$
either vanishes identically on $(a,b)$ or never.

8. Two $n \times n$ matrices, $A$ and $B$, are similar if $B = S^{-1}AS$ for some invertible $n \times n$
matrix $S$. Show that if two matrices are similar, they have the same characteristic
polynomials. The characteristic polynomial of $A$ is $\det(\lambda I - A)$.

9. Suppose the characteristic polynomial of an $n \times n$ matrix $A$ is of the form

$$t^n + a_{n-1}t^{n-1} + \cdots + a_1t + a_0$$

and that $a_0 \neq 0$. Find a formula for $A^{-1}$ in terms of powers of the matrix $A$. Show that
$A^{-1}$ exists if and only if $a_0 \neq 0$. In fact, show that $a_0 = (-1)^n\det(A)$.

10. ↑Letting $p(t)$ denote the characteristic polynomial of $A$, show that $p_\varepsilon(t) = p(t - \varepsilon)$
is the characteristic polynomial of $A + \varepsilon I$. Then show that if $\det(A) = 0$, it follows
that $\det(A + \varepsilon I) \neq 0$ whenever $|\varepsilon|$ is sufficiently small.

11. In constitutive modeling of the stress and strain tensors, one sometimes considers sums
of the form $\sum_{k=0}^{\infty} a_kA^k$ where $A$ is a $3 \times 3$ matrix. Show using the Cayley Hamilton
theorem that if such a thing makes any sense, you can always obtain it as a finite sum
having no more than $n$ terms.

12. Recall you can find the determinant from expanding along the $j^{th}$ column.

$$\det(A) = \sum_i A_{ij}\left(\operatorname{cof}(A)\right)_{ij}$$

Think of $\det(A)$ as a function of the entries, $A_{ij}$. Explain why the $ij^{th}$ cofactor is
really just

$$\frac{\partial\det(A)}{\partial A_{ij}}.$$


13. Let $U$ be an open set in $\mathbb{R}^n$ and let $g : U \to \mathbb{R}^n$ be such that all the first partial
derivatives of all components of $g$ exist and are continuous. Under these conditions
form the matrix $Dg(x)$ given by

$$Dg(x)_{ij} = g_{i,j}(x) \quad\text{where}\quad g_{i,j}(x) = \frac{\partial g_i(x)}{\partial x_j}.$$

The best kept secret in calculus courses is that the linear transformation determined
by this matrix $Dg(x)$ is called the derivative of $g$ and is the correct generalization
of the concept of derivative of a function of one variable. Suppose the second partial
derivatives also exist and are continuous. Then show that

$$\sum_j \left(\operatorname{cof}(Dg)\right)_{ij,j} = 0.$$

Hint: First explain why $\sum_i g_{i,k}\operatorname{cof}(Dg)_{ij} = \delta_{jk}\det(Dg)$. Next differentiate with
respect to $x_j$ and sum on $j$ using the equality of mixed partial derivatives. Assume
$\det(Dg) \neq 0$ to prove the identity in this special case. Then explain using Problem 10
why there exists a sequence $\varepsilon_k \to 0$ such that for $g_{\varepsilon_k}(x) = g(x) + \varepsilon_kx$, $\det(Dg_{\varepsilon_k}) \neq 0$
and so the identity holds for $g_{\varepsilon_k}$. Then take a limit to get the desired result in general.
This is an extremely important identity which has surprising implications. One can
build degree theory on it for example. It also leads to simple proofs of the Brouwer
fixed point theorem from topology.

14. A determinant of the form

$$\det\begin{pmatrix} 1 & 1 & \cdots & 1 \\ a_0 & a_1 & \cdots & a_n \\ \vdots & \vdots & & \vdots \\ a_0^{n-1} & a_1^{n-1} & \cdots & a_n^{n-1} \\ a_0^n & a_1^n & \cdots & a_n^n \end{pmatrix}$$

is called a Vandermonde determinant. Show this determinant equals

$$\prod_{0 \leq i < j \leq n} (a_j - a_i).$$

By this is meant to take the product of all terms of the form $(a_j - a_i)$ such that $j > i$.
Hint: Show it works if $n = 1$ so you are looking at

$$\det\begin{pmatrix} 1 & 1 \\ a_0 & a_1 \end{pmatrix}.$$

Then suppose it holds for $n - 1$ and consider the case $n$. Consider the polynomial in
$t$, $p(t)$ which is obtained from the above by replacing the last column with the column

$$\begin{pmatrix} 1 & t & \cdots & t^n \end{pmatrix}^T.$$

Explain why $p(a_i) = 0$ for $i = 0,\cdots,n-1$. Explain why

$$p(t) = c\prod_{i=0}^{n-1}(t - a_i).$$

Of course $c$ is the coefficient of $t^n$. Find this coefficient from the above description of
$p(t)$ and the induction hypothesis. Then plug in $t = a_n$ and observe you have the
formula valid for $n$.
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4.1 Elementary Matrices 

The elementary matrices result from doing a row operation to the identity matrix. 
Definition 4.1.1 The row operations consist of the following 

1. Switch two rows. 

2. Multiply a row by a nonzero number. 

3. Replace a row by a multiple of another row added to it. 

The elementary matrices are given in the following definition. 

Definition 4.1.2 The elementary matrices consist of those matrices which result by applying
a row operation to an identity matrix. Those which involve switching rows of the identity
are called permutation matrices. More generally, if $(i_1, i_2, \cdots, i_n)$ is a permutation, a matrix
which has a 1 in the $i_k$ position in row $k$ and zero in every other position of that row is
called a permutation matrix. Thus each permutation corresponds to a unique permutation
matrix.

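As an example of why these elementary matrices are interesting, consider the following.

$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a & b & c & d \\ x & y & z & w \\ f & g & h & i \end{pmatrix} = \begin{pmatrix} x & y & z & w \\ a & b & c & d \\ f & g & h & i \end{pmatrix}$$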

A 3 x 4 matrix was multiplied on the left by an elementary matrix which was obtained from 
row operation 1 applied to the identity matrix. This resulted in applying the operation 1 
to the given matrix. This is what happens in general. 

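Now consider what these elementary matrices look like. First consider the one which
involves switching row $i$ and row $j$ where $i < j$. This matrix is of the form

$$\begin{pmatrix} 1 & & & & \\ & 0 & \cdots & 1 & \\ & \vdots & \ddots & \vdots & \\ & 1 & \cdots & 0 & \\ & & & & 1 \end{pmatrix}$$

The two exceptional rows are shown. The $i^{th}$ row was the $j^{th}$ and the $j^{th}$ row was the $i^{th}$
in the identity matrix. Now consider what this does to a column vector.

$$\begin{pmatrix} 1 & & & & \\ & 0 & \cdots & 1 & \\ & \vdots & \ddots & \vdots & \\ & 1 & \cdots & 0 & \\ & & & & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_i \\ \vdots \\ v_j \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} v_1 \\ \vdots \\ v_j \\ \vdots \\ v_i \\ \vdots \\ v_n \end{pmatrix}$$

Now denote by $P^{ij}$ the elementary matrix which comes from the identity from switching
rows $i$ and $j$. From what was just explained consider multiplication on the left by this
elementary matrix.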

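$$P^{ij} \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ip} \\ \vdots & \vdots & & \vdots \\ a_{j1} & a_{j2} & \cdots & a_{jp} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{pmatrix}$$

From the way you multiply matrices this is a matrix which has the indicated columns.

$$\left( P^{ij} \begin{pmatrix} a_{11} \\ \vdots \\ a_{i1} \\ \vdots \\ a_{j1} \\ \vdots \\ a_{n1} \end{pmatrix}, \ P^{ij} \begin{pmatrix} a_{12} \\ \vdots \\ a_{i2} \\ \vdots \\ a_{j2} \\ \vdots \\ a_{n2} \end{pmatrix}, \ \cdots, \ P^{ij} \begin{pmatrix} a_{1p} \\ \vdots \\ a_{ip} \\ \vdots \\ a_{jp} \\ \vdots \\ a_{np} \end{pmatrix} \right) = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ \vdots & \vdots & & \vdots \\ a_{j1} & a_{j2} & \cdots & a_{jp} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ip} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{pmatrix}$$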
This has established the following lemma. 

Lemma 4.1.3 Let $P^{ij}$ denote the elementary matrix which involves switching the $i^{th}$ and
the $j^{th}$ rows. Then

$$P^{ij} A = B$$

where $B$ is obtained from $A$ by switching the $i^{th}$ and the $j^{th}$ rows.

As a consequence of the above lemma, if you have any permutation $(i_1, \cdots, i_n)$, it
follows from Lemma 3.3.2 that the corresponding permutation matrix can be obtained by
multiplying finitely many permutation matrices, each of which switches only two rows. Now
every such permutation matrix in which only two rows are switched has determinant $-1$.
Therefore, the determinant of the permutation matrix for $(i_1, \cdots, i_n)$ equals $(-1)^p$ where
the given permutation can be obtained by making $p$ switches. Now $p$ is not unique. There
are many ways to make switches and end up with a given permutation, but what this shows
is that the total number of switches is either always odd or always even. That is, you could
not obtain a given permutation both by making $2m$ switches and by making $2k + 1$ switches.
A permutation is said to be even if $p$ is even and odd if $p$ is odd. This is an interesting
result in abstract algebra which is obtained very easily from a consideration of elementary
matrices and of course the theory of the determinant. Also, this shows that the composition
of permutations corresponds to the product of the corresponding permutation matrices.

To see permutations considered more directly in the context of group theory, you should 
see a good abstract algebra book such as [17] or [13]. 
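
The parity statement above can be checked on small cases. The sketch below is illustrative only: it uses 0-based indices, the names `permutation_matrix`, `switches_to_sort` and `det` are hypothetical helpers, and the particular way of counting switches is just one of the many possible ways mentioned in the text.

```python
from fractions import Fraction
from itertools import permutations

def permutation_matrix(perm):
    """Row k has a 1 in position perm[k] and zeros elsewhere (0-based)."""
    n = len(perm)
    return [[1 if col == perm[k] else 0 for col in range(n)] for k in range(n)]

def det(M):
    """Determinant by row reduction over exact fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            result = -result
        result *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return result

def switches_to_sort(perm):
    """Count the two-element switches one particular procedure uses to sort perm."""
    p = list(perm)
    count = 0
    for k in range(len(p)):
        while p[k] != k:
            target = p[k]
            p[k], p[target] = p[target], p[k]  # moves the value `target` into slot `target`
            count += 1
    return count

# Every permutation of four symbols: the determinant equals (-1)^p.
print(all(det(permutation_matrix(p)) == (-1) ** switches_to_sort(p)
          for p in permutations(range(4))))
```

Any other way of counting switches would give a possibly different $p$ but the same value of $(-1)^p$, which is the point of the paragraph above.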

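Next consider the row operation which involves multiplying the $i^{th}$ row by a nonzero
constant, $c$. The elementary matrix which results from applying this operation to the $i^{th}$
row of the identity matrix is of the form

$$\begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & c & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}$$

Now consider what this does to a column vector.

$$\begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & c & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_i \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} v_1 \\ \vdots \\ c v_i \\ \vdots \\ v_n \end{pmatrix}$$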






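Denote by $E(c, i)$ this elementary matrix which multiplies the $i^{th}$ row of the identity by the
nonzero constant, $c$. Then from what was just discussed and the way matrices are multiplied,

$$E(c, i) \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ip} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{pmatrix}$$

equals a matrix having the columns indicated below.

$$\left( E(c, i) \begin{pmatrix} a_{11} \\ \vdots \\ a_{i1} \\ \vdots \\ a_{n1} \end{pmatrix}, \ \cdots, \ E(c, i) \begin{pmatrix} a_{1p} \\ \vdots \\ a_{ip} \\ \vdots \\ a_{np} \end{pmatrix} \right) = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ \vdots & \vdots & & \vdots \\ c a_{i1} & c a_{i2} & \cdots & c a_{ip} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{pmatrix}$$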

This proves the following lemma. 

Lemma 4.1.4 Let $E(c, i)$ denote the elementary matrix corresponding to the row operation
in which the $i^{th}$ row is multiplied by the nonzero constant, $c$. Thus $E(c, i)$ involves
multiplying the $i^{th}$ row of the identity matrix by $c$. Then

$$E(c, i) A = B$$

where $B$ is obtained from $A$ by multiplying the $i^{th}$ row of $A$ by $c$.

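Finally consider the third of these row operations. Denote by $E(c \times i + j)$ the elementary
matrix which replaces the $j^{th}$ row with itself added to $c$ times the $i^{th}$ row. In
case $i < j$ this will be of the form

$$\begin{pmatrix} 1 & & & & \\ & 1 & & & \\ & \vdots & \ddots & & \\ & c & \cdots & 1 & \\ & & & & 1 \end{pmatrix}$$

in which the $c$ occupies row $j$ and column $i$. Now consider what this does to a column vector.

$$\begin{pmatrix} 1 & & & & \\ & 1 & & & \\ & \vdots & \ddots & & \\ & c & \cdots & 1 & \\ & & & & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_i \\ \vdots \\ v_j \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} v_1 \\ \vdots \\ v_i \\ \vdots \\ c v_i + v_j \\ \vdots \\ v_n \end{pmatrix}$$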
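Now from this and the way matrices are multiplied,

$$E(c \times i + j) \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ip} \\ \vdots & \vdots & & \vdots \\ a_{j1} & a_{j2} & \cdots & a_{jp} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{pmatrix}$$

equals a matrix of the following form having the indicated columns.

$$\left( E(c \times i + j) \begin{pmatrix} a_{11} \\ \vdots \\ a_{i1} \\ \vdots \\ a_{j1} \\ \vdots \\ a_{n1} \end{pmatrix}, \ \cdots, \ E(c \times i + j) \begin{pmatrix} a_{1p} \\ \vdots \\ a_{ip} \\ \vdots \\ a_{jp} \\ \vdots \\ a_{np} \end{pmatrix} \right) = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ip} \\ \vdots & \vdots & & \vdots \\ a_{j1} + c a_{i1} & a_{j2} + c a_{i2} & \cdots & a_{jp} + c a_{ip} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{pmatrix}$$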

The case where i > j is handled similarly. This proves the following lemma. 

Lemma 4.1.5 Let $E(c \times i + j)$ denote the elementary matrix obtained from $I$ by replacing
the $j^{th}$ row with $c$ times the $i^{th}$ row added to it. Then

$$E(c \times i + j) A = B$$

where $B$ is obtained from $A$ by replacing the $j^{th}$ row of $A$ with itself added to $c$ times the
$i^{th}$ row of $A$.



The next theorem is the main result.






Theorem 4.1.6 To perform any of the three row operations on a matrix A it suffices to do 
the row operation on the identity matrix obtaining an elementary matrix E and then take 
the product, EA. Furthermore, each elementary matrix is invertible and its inverse is an 
elementary matrix. 

Proof: The first part of this theorem has been proved in Lemmas 4.1.3 - 4.1.5. It
only remains to verify the claim about the inverses. Consider first the elementary matrices
corresponding to row operation of type three.

$$E(-c \times i + j) \, E(c \times i + j) = I$$

This follows because the matrix $E(c \times i + j)$ takes $c$ times row $i$ in the identity and adds it to row $j$.
When multiplied on the left by $E(-c \times i + j)$ it follows from the first part of this theorem
that you take the $i^{th}$ row of $E(c \times i + j)$ which coincides with the $i^{th}$ row of $I$ since that
row was not changed, multiply it by $-c$ and add to the $j^{th}$ row of $E(c \times i + j)$ which was
the $j^{th}$ row of $I$ added to $c$ times the $i^{th}$ row of $I$. Thus $E(-c \times i + j)$ multiplied on the
left, undoes the row operation which resulted in $E(c \times i + j)$. The same argument applied
to the product

$$E(c \times i + j) \, E(-c \times i + j)$$

replacing $c$ with $-c$ in the argument yields that this product is also equal to $I$. Therefore,
$E(c \times i + j)^{-1} = E(-c \times i + j)$.

Similar reasoning shows that for $E(c, i)$ the elementary matrix which comes from multiplying
the $i^{th}$ row by the nonzero constant, $c$,

$$E(c, i)^{-1} = E(c^{-1}, i).$$

Finally, consider $P^{ij}$ which involves switching the $i^{th}$ and the $j^{th}$ rows.

$$P^{ij} P^{ij} = I$$

because by the first part of this theorem, multiplying on the left by $P^{ij}$ switches the $i^{th}$
and $j^{th}$ rows of $P^{ij}$ which was obtained from switching the $i^{th}$ and $j^{th}$ rows of the identity.
First you switch them to get $P^{ij}$ and then you multiply on the left by $P^{ij}$ which switches
these rows again and restores the identity matrix. Thus $(P^{ij})^{-1} = P^{ij}$. ■
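
Theorem 4.1.6 is easy to try out on a small matrix. What follows is a rough sketch under stated assumptions rather than anything from the original text: rows are numbered from 0, and the function names `P`, `E_scale` and `E_add` are invented here to mirror $P^{ij}$, $E(c, i)$ and $E(c \times i + j)$.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

def matmul(X, Y):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def P(n, i, j):
    """Switch rows i and j of the n x n identity."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def E_scale(n, c, i):
    """Multiply row i of the identity by the nonzero constant c."""
    E = identity(n)
    E[i][i] = Fraction(c)
    return E

def E_add(n, c, i, j):
    """Replace row j of the identity with itself plus c times row i (i != j)."""
    E = identity(n)
    E[j][i] = Fraction(c)
    return E

A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

# Multiplying on the left performs the row operation on A.
print(matmul(P(3, 0, 1), A) == [A[1], A[0], A[2]])
print(matmul(E_scale(3, 5, 2), A) == [A[0], A[1], [5 * x for x in A[2]]])
print(matmul(E_add(3, -5, 0, 1), A) == [A[0], [y - 5 * x for x, y in zip(A[0], A[1])], A[2]])

# Each inverse is again an elementary matrix of the same type.
print(matmul(E_add(3, -5, 0, 1), E_add(3, 5, 0, 1)) == identity(3))
print(matmul(E_scale(3, Fraction(1, 5), 2), E_scale(3, 5, 2)) == identity(3))
print(matmul(P(3, 0, 1), P(3, 0, 1)) == identity(3))
```

Exact fractions are used only so that the comparison with the identity is not disturbed by rounding.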

4.2 The Rank Of A Matrix

Recall the following definition of rank of a matrix. 

Definition 4.2.1 A submatrix of a matrix A is the rectangular array of numbers obtained 
by deleting some rows and columns of A. Let A be an m x n matrix. The determinant 
rank of the matrix equals r where r is the largest number such that some r x r submatrix 
of A has a non zero determinant. The row rank is defined to be the dimension of the span 
of the rows. The column rank is defined to be the dimension of the span of the columns. 
The rank of A is denoted as rank (A) . 

The following theorem is proved in the section on the theory of the determinant and is
restated here for convenience.

Theorem 4.2.2 Let $A$ be an $m \times n$ matrix. Then the row rank, column rank and determinant
rank are all the same.

So how do you find the rank? It turns out that row operations are the key to the practical 
computation of the rank of a matrix. 

In rough terms, the following lemma states that linear relationships between columns 
in a matrix are preserved by row operations. 

Lemma 4.2.3 Let $B$ and $A$ be two $m \times n$ matrices and suppose $B$ results from a row
operation applied to $A$. Then the $k^{th}$ column of $B$ is a linear combination of the $i_1, \cdots, i_r$
columns of $B$ if and only if the $k^{th}$ column of $A$ is a linear combination of the $i_1, \cdots, i_r$
columns of $A$. Furthermore, the scalars in the linear combination are the same. (The linear
relationship between the $k^{th}$ column of $A$ and the $i_1, \cdots, i_r$ columns of $A$ is the same as the
linear relationship between the $k^{th}$ column of $B$ and the $i_1, \cdots, i_r$ columns of $B$.)

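Proof: Let $A$ equal the following matrix in which the $\mathbf{a}_k$ are the columns

$$\begin{pmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{pmatrix}$$

and let $B$ equal the following matrix in which the columns are given by the $\mathbf{b}_k$

$$\begin{pmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{pmatrix}$$

Then by Theorem 4.1.6 on Page 142 $\mathbf{b}_k = E \mathbf{a}_k$ where $E$ is an elementary matrix. Suppose
then that one of the columns of $A$ is a linear combination of some other columns of $A$. Say

$$\mathbf{a}_k = \sum_{r \in S} c_r \mathbf{a}_r.$$

Then multiplying by $E$,

$$\mathbf{b}_k = E \mathbf{a}_k = \sum_{r \in S} c_r E \mathbf{a}_r = \sum_{r \in S} c_r \mathbf{b}_r.$$

The converse follows in the same way because $A$ results from $B$ by a row operation, the one
corresponding to the elementary matrix $E^{-1}$. ■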



Corollary 4.2.4 Let A and B be two m x n matrices such that B is obtained by applying 
a row operation to A. Then the two matrices have the same rank. 

Proof: Lemma 4.2.3 says the linear relationships are the same between the columns of 
A and those of B. Therefore, the column rank of the two matrices is the same. ■ 

This suggests that to find the rank of a matrix, one should do row operations until a
matrix is obtained whose rank is obvious.

Example 4.2.5 Find the rank of the following matrix and identify columns whose linear
combinations yield all the other columns.

$$\begin{pmatrix} 1 & 2 & 1 & 3 & 2 \\ 1 & 3 & 6 & 0 & 2 \\ 3 & 7 & 8 & 6 & 6 \end{pmatrix}$$

Take $(-1)$ times the first row and add to the second and then take $(-3)$ times the first
row and add to the third. This yields

$$\begin{pmatrix} 1 & 2 & 1 & 3 & 2 \\ 0 & 1 & 5 & -3 & 0 \\ 0 & 1 & 5 & -3 & 0 \end{pmatrix}$$

By the above corollary, this matrix has the same rank as the first matrix. Now take $(-1)$
times the second row and add to the third row yielding

$$\begin{pmatrix} 1 & 2 & 1 & 3 & 2 \\ 0 & 1 & 5 & -3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

At this point it is clear the rank is 2. This is because every column is in the span of the
first two and these first two columns are linearly independent.

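Example 4.2.6 Find the rank of the following matrix and identify columns whose linear
combinations yield all the other columns.

$$\begin{pmatrix} 1 & 2 & 1 & 3 & 2 \\ 1 & 2 & 6 & 0 & 2 \\ 3 & 6 & 8 & 6 & 6 \end{pmatrix} \tag{4.2}$$

Take $(-1)$ times the first row and add to the second and then take $(-3)$ times the first
row and add to the last row. This yields

$$\begin{pmatrix} 1 & 2 & 1 & 3 & 2 \\ 0 & 0 & 5 & -3 & 0 \\ 0 & 0 & 5 & -3 & 0 \end{pmatrix}$$

Now multiply the second row by $1/5$ and add $(-5)$ times it to the last row.

$$\begin{pmatrix} 1 & 2 & 1 & 3 & 2 \\ 0 & 0 & 1 & -3/5 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

Add $(-1)$ times the second row to the first.

$$\begin{pmatrix} 1 & 2 & 0 & 18/5 & 2 \\ 0 & 0 & 1 & -3/5 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \tag{4.3}$$

It is now clear the rank of this matrix is 2 because the first and third columns form a
basis for the column space.

The matrix 4.3 is the row reduced echelon form for the matrix 4.2.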

4.3 The Row Reduced Echelon Form 

The following definition is for the row reduced echelon form of a matrix. 

Definition 4.3.1 Let $\mathbf{e}_i$ denote the column vector which has all zero entries except for the
$i^{th}$ slot which is one. An $m \times n$ matrix is said to be in row reduced echelon form if, in viewing
successive columns from left to right, the first nonzero column encountered is $\mathbf{e}_1$ and if you
have encountered $\mathbf{e}_1, \mathbf{e}_2, \cdots, \mathbf{e}_k$, the next column is either $\mathbf{e}_{k+1}$ or is a linear combination
of the vectors, $\mathbf{e}_1, \mathbf{e}_2, \cdots, \mathbf{e}_k$.

For example, here are some matrices which are in row reduced echelon form. 



[The example matrices are not recoverable from the source.]

Theorem 4.3.2 Let $A$ be an $m \times n$ matrix. Then $A$ has a row reduced echelon form
determined by a simple process.

Proof: Viewing the columns of $A$ from left to right take the first nonzero column. Pick
a nonzero entry in this column and switch the row containing this entry with the top row of
$A$. Now divide this new top row by the value of this nonzero entry to get a 1 in this position
and then use row operations to make all entries below this entry equal to zero. Thus the
first nonzero column is now $\mathbf{e}_1$. Denote the resulting matrix by $A_1$. Consider the submatrix
of $A_1$ to the right of this column and below the first row. Do exactly the same thing for it
that was done for $A$. This time the $\mathbf{e}_1$ will refer to $\mathbb{F}^{m-1}$. Use this 1 and row operations
to zero out every entry above it in the rows of $A_1$. Call the resulting matrix $A_2$. Thus $A_2$
satisfies the conditions of the above definition up to the column just encountered. Continue
this way till every column has been dealt with and the result must be in row reduced echelon
form. ■
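
The process in this proof can be sketched in a few lines of code. The version below is a Gauss-Jordan style variant, not a transcription of the proof: it clears the entries above and below each leading 1 in a single pass instead of recursing on the submatrix, the name `rref` is chosen here for illustration, and columns are numbered from 0. The test matrix is the one from Example 4.2.6, taken to have rows (1, 2, 1, 3, 2), (1, 2, 6, 0, 2), (3, 6, 8, 6, 6).

```python
from fractions import Fraction

def rref(M):
    """Return (R, pivots): a row reduced echelon form of M and its pivot columns."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivots = []
    r = 0  # index of the row that receives the next leading 1
    for c in range(cols):
        if r == rows:
            break
        # Find a nonzero entry in column c at or below row r.
        pivot = next((k for k in range(r, rows) if A[k][c] != 0), None)
        if pivot is None:
            continue  # this column is a combination of earlier pivot columns
        A[r], A[pivot] = A[pivot], A[r]       # row operation 1: switch rows
        lead = A[r][c]
        A[r] = [x / lead for x in A[r]]       # row operation 2: scale to get a 1
        for k in range(rows):                 # row operation 3: clear the column
            if k != r and A[k][c] != 0:
                factor = A[k][c]
                A[k] = [x - factor * y for x, y in zip(A[k], A[r])]
        pivots.append(c)
        r += 1
    return A, pivots

R, pivots = rref([[1, 2, 1, 3, 2],
                  [1, 2, 6, 0, 2],
                  [3, 6, 8, 6, 6]])
print(pivots)  # [0, 2], i.e. the first and third columns
for row in R:
    print([str(x) for x in row])
```

The number of pivot columns returned is the rank, so the same routine also answers the question posed in Examples 4.2.5 and 4.2.6.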

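The following diagram illustrates the above procedure. Say the matrix looked something
like the following.

$$\begin{pmatrix} 0 & * & * & * & * & * & * \\ 0 & * & * & * & * & * & * \\ 0 & * & * & * & * & * & * \end{pmatrix}$$

First step would yield something like

$$\begin{pmatrix} 0 & 1 & * & * & * & * & * \\ 0 & 0 & * & * & * & * & * \\ 0 & 0 & * & * & * & * & * \end{pmatrix}$$

For the second step you look at the lower right corner as described,

$$\begin{pmatrix} * & * & * & * & * \\ * & * & * & * & * \end{pmatrix}$$

and if the first column consists of all zeros but the next one is not all zeros, you would get
something like this.

$$\begin{pmatrix} 0 & 1 & * & * & * \\ 0 & 0 & * & * & * \end{pmatrix}$$

Thus, after zeroing out the term in the top row above the 1, you get the following for the
next step in the computation of the row reduced echelon form for the original matrix.

$$\begin{pmatrix} 0 & 1 & * & 0 & * & * & * \\ 0 & 0 & 0 & 1 & * & * & * \\ 0 & 0 & 0 & 0 & * & * & * \end{pmatrix}$$

Next you look at the lower right matrix below the top two rows and to the right of the first
four columns and repeat the process.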

Definition 4.3.3 The first pivot column of $A$ is the first nonzero column of $A$. The next
pivot column is the first column after this which is not a linear combination of the columns to
its left. The third pivot column is the next column after this which is not a linear combination
of those columns to its left, and so forth. Thus by Lemma 4.2.3 if a pivot column occurs
as the $j^{th}$ column from the left, it follows that in the row reduced echelon form there will be
one of the $\mathbf{e}_k$ as the $j^{th}$ column.

There are three choices for row operations at each step in the above theorem. A natural 
question is whether the same row reduced echelon matrix always results in the end from 
following the above algorithm applied in any way. The next corollary says this is the case. 



Definition 4.3.4 Two matrices are said to be row equivalent if one can be obtained from
the other by a sequence of row operations.

Since every row operation can be obtained by multiplication on the left by an elementary
matrix and since each of these elementary matrices has an inverse which is also an elementary
matrix, it follows that row equivalence is an equivalence relation. Thus one can classify matrices
according to which equivalence class they are in. Later in the book, another more profound
way of classifying matrices will be presented.

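It has been shown above that every matrix is row equivalent to one which is in row
reduced echelon form. Note

$$\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = x_1 \mathbf{e}_1 + \cdots + x_n \mathbf{e}_n$$

so to say two column vectors are equal is to say they are the same linear combination of the
special vectors $\mathbf{e}_j$.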

Corollary 4.3.5 The row reduced echelon form is unique. That is if B,C are two matrices 
in row reduced echelon form and both are row equivalent to A, then B = C. 

Proof: Suppose $B$ and $C$ are both row reduced echelon forms for the matrix $A$. Then
they clearly have the same zero columns since row operations leave zero columns unchanged.
If $B$ has the sequence $\mathbf{e}_1, \mathbf{e}_2, \cdots, \mathbf{e}_r$ occurring for the first time in the positions, $i_1, i_2, \cdots, i_r$,
the description of the row reduced echelon form means that each of these columns is not a
linear combination of the preceding columns. Therefore, by Lemma 4.2.3, the same is true of
the columns in positions $i_1, i_2, \cdots, i_r$ for $C$. It follows from the description of the row reduced
echelon form, that $\mathbf{e}_1, \cdots, \mathbf{e}_r$ occur respectively for the first time in columns $i_1, i_2, \cdots, i_r$
for $C$. Thus $B, C$ have the same columns in these positions. By Lemma 4.2.3, the other
columns in the two matrices are linear combinations, involving the same scalars, of the
columns in the $i_1, \cdots, i_r$ position. Thus each column of $B$ is identical to the corresponding
column in $C$. ■

The above corollary shows that you can determine whether two matrices are row equiv- 
alent by simply checking their row reduced echelon forms. The matrices are row equivalent 
if and only if they have the same row reduced echelon form. 

The following corollary follows. 

Corollary 4.3.6 Let $A$ be an $m \times n$ matrix and let $R$ denote the row reduced echelon form
obtained from $A$ by row operations. Then there exists a sequence of elementary matrices,
$E_1, \cdots, E_p$ such that

$$(E_p E_{p-1} \cdots E_1) A = R.$$

Proof: This follows from the fact that row operations are equivalent to multiplication 
on the left by an elementary matrix. ■ 

Corollary 4.3.7 Let A be an invertible n x n matrix. Then A equals a finite product of 
elementary matrices. 

Proof: Since $A^{-1}$ is given to exist, it follows $A$ must have rank $n$ because by Theorem
3.3.18 $\det(A) \neq 0$ which says the determinant rank and hence the column rank of $A$ is $n$
and so the row reduced echelon form of $A$ is $I$ because the columns of $A$ form a linearly
independent set. Therefore, by Corollary 4.3.6 there is a sequence of elementary matrices,
$E_1, \cdots, E_p$ such that

$$(E_p E_{p-1} \cdots E_1) A = I.$$

But now multiply on the left on both sides by $E_p^{-1}$ then by $E_{p-1}^{-1}$ and then by $E_{p-2}^{-1}$ etc.
until you get

$$A = E_1^{-1} E_2^{-1} \cdots E_{p-1}^{-1} E_p^{-1}$$

and by Theorem 4.1.6 each of these in this product is an elementary matrix. ■



Corollary 4.3.8 The rank of a matrix equals the number of nonzero pivot columns. Fur- 
thermore, every column is contained in the span of the pivot columns. 



Proof: Write the row reduced echelon form for the matrix. From Corollary 4.2.4 this 
row reduced matrix has the same rank as the original matrix. Deleting all the zero rows 
and all the columns in the row reduced echelon form which do not correspond to a pivot 
column, yields an r x r identity submatrix in which r is the number of pivot columns. Thus 
the rank is at least r. 

From Lemma 4.2.3 every column of A is a linear combination of the pivot columns since 
this is true by definition for the row reduced echelon form. Therefore, the rank is no more 
than r. ■ 

Here is a fundamental observation related to the above. 

Corollary 4.3.9 Suppose $A$ is an $m \times n$ matrix and that $m < n$. That is, the number of rows
is less than the number of columns. Then one of the columns of $A$ is a linear combination
of the preceding columns of $A$.

Proof: Since $m < n$, not all the columns of $A$ can be pivot columns. That is, in the
row reduced echelon form say $\mathbf{e}_k$ occurs for the first time at $r_k$ where $r_1 < r_2 < \cdots < r_p$
where $p \le m$. It follows since $m < n$, there exists some column in the row reduced echelon
form which is a linear combination of the preceding columns. By Lemma 4.2.3 the same is
true of the columns of $A$. ■

Definition 4.3.10 Let $A$ be an $m \times n$ matrix having rank $r$. Then the nullity of $A$ is defined
to be $n - r$. Also define $\ker(A) = \{\mathbf{x} \in \mathbb{F}^n : A\mathbf{x} = \mathbf{0}\}$. This is also denoted as $N(A)$.

Observation 4.3.11 Note that $\ker(A)$ is a subspace because if $a, b$ are scalars and $\mathbf{x}, \mathbf{y}$ are
vectors in $\ker(A)$, then

$$A(a\mathbf{x} + b\mathbf{y}) = aA\mathbf{x} + bA\mathbf{y} = \mathbf{0} + \mathbf{0} = \mathbf{0}$$

Recall that the dimension of the column space of a matrix equals its rank and since the
column space is just $A(\mathbb{F}^n)$, the rank is just the dimension of $A(\mathbb{F}^n)$. The next theorem
shows that the nullity equals the dimension of $\ker(A)$.

Theorem 4.3.12 Let $A$ be an $m \times n$ matrix. Then $\operatorname{rank}(A) + \dim(\ker(A)) = n$.

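Proof: Since $\ker(A)$ is a subspace, there exists a basis for $\ker(A)$, $\{\mathbf{x}_1, \cdots, \mathbf{x}_k\}$. Also
let $\{A\mathbf{y}_1, \cdots, A\mathbf{y}_l\}$ be a basis for $A(\mathbb{F}^n)$. Let $\mathbf{u} \in \mathbb{F}^n$. Then there exist unique scalars $c_i$
such that

$$A\mathbf{u} = \sum_{i=1}^{l} c_i A\mathbf{y}_i.$$

It follows that

$$A\left( \mathbf{u} - \sum_{i=1}^{l} c_i \mathbf{y}_i \right) = \mathbf{0}$$

and so the vector in parenthesis is in $\ker(A)$. Thus there exist unique $b_j$ such that

$$\mathbf{u} = \sum_{i=1}^{l} c_i \mathbf{y}_i + \sum_{j=1}^{k} b_j \mathbf{x}_j.$$

Since $\mathbf{u}$ was arbitrary, this shows $\{\mathbf{x}_1, \cdots, \mathbf{x}_k, \mathbf{y}_1, \cdots, \mathbf{y}_l\}$ spans $\mathbb{F}^n$. If these vectors are
independent, then they will form a basis and the claimed equation will be obtained. Suppose
then that

$$\sum_{i=1}^{l} c_i \mathbf{y}_i + \sum_{j=1}^{k} b_j \mathbf{x}_j = \mathbf{0}.$$

Apply $A$ to both sides. This yields

$$\sum_{i=1}^{l} c_i A\mathbf{y}_i = \mathbf{0}$$

and so each $c_i = 0$. Then the independence of the $\mathbf{x}_j$ implies each $b_j = 0$. ■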
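
Theorem 4.3.12 can be illustrated by computing a basis for $\ker(A)$ from the row reduced echelon form. The following is only a sketch: `rref`, `null_space_basis` and `matvec` are names invented for this illustration, columns are numbered from 0, and the matrix is the one from Example 4.2.6, taken to have rows (1, 2, 1, 3, 2), (1, 2, 6, 0, 2), (3, 6, 8, 6, 6).

```python
from fractions import Fraction

def rref(M):
    """Return (R, pivots): a row reduced echelon form of M and its pivot columns."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        pivot = next((k for k in range(r, rows) if A[k][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        lead = A[r][c]
        A[r] = [x / lead for x in A[r]]
        for k in range(rows):
            if k != r and A[k][c] != 0:
                factor = A[k][c]
                A[k] = [x - factor * y for x, y in zip(A[k], A[r])]
        pivots.append(c)
        r += 1
    return A, pivots

def null_space_basis(M):
    """One basis vector of ker(M) for each non-pivot (free) column."""
    R, pivots = rref(M)
    n = len(M[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)                 # set the free variable to 1
        for row, p in enumerate(pivots):
            v[p] = -R[row][f]              # solve for the pivot variables
        basis.append(v)
    return basis

def matvec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]

A = [[1, 2, 1, 3, 2],
     [1, 2, 6, 0, 2],
     [3, 6, 8, 6, 6]]
rank = len(rref(A)[1])
basis = null_space_basis(A)
print(rank, len(basis), rank + len(basis) == len(A[0]))
print(all(matvec(A, v) == [0, 0, 0] for v in basis))
```

Each basis vector has a 1 in its own free column and 0 in the other free columns, which is why the vectors produced this way are independent.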

4.4 Rank And Existence Of Solutions To Linear Systems

Consider the linear system of equations,

$$A\mathbf{x} = \mathbf{b} \tag{4.4}$$

where $A$ is an $m \times n$ matrix, $\mathbf{x}$ is an $n \times 1$ column vector, and $\mathbf{b}$ is an $m \times 1$ column vector.
Suppose

$$A = \begin{pmatrix} \mathbf{a}_1 & \cdots & \mathbf{a}_n \end{pmatrix}$$

where the $\mathbf{a}_k$ denote the columns of $A$. Then $\mathbf{x} = (x_1, \cdots, x_n)^T$ is a solution of the system
4.4, if and only if

$$x_1 \mathbf{a}_1 + \cdots + x_n \mathbf{a}_n = \mathbf{b}$$

which says that $\mathbf{b}$ is a vector in $\operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_n)$. This shows that there exists a solution
to the system, 4.4 if and only if $\mathbf{b}$ is contained in $\operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_n)$. In words, there is a
solution to 4.4 if and only if $\mathbf{b}$ is in the column space of $A$. In terms of rank, the following
proposition describes the situation.

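Proposition 4.4.1 Let $A$ be an $m \times n$ matrix and let $\mathbf{b}$ be an $m \times 1$ column vector. Then
there exists a solution to 4.4 if and only if

$$\operatorname{rank}\begin{pmatrix} A & | & \mathbf{b} \end{pmatrix} = \operatorname{rank}(A). \tag{4.5}$$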
Proof: Place $\begin{pmatrix} A & | & \mathbf{b} \end{pmatrix}$ and $A$ in row reduced echelon form, respectively $B$ and $C$. If
the above condition on rank is true, then both $B$ and $C$ have the same number of nonzero
rows. In particular, you cannot have a row of the form

$$\begin{pmatrix} 0 & \cdots & 0 & \star \end{pmatrix}$$

where $\star \neq 0$ in $B$. Therefore, there will exist a solution to the system 4.4.

Conversely, suppose there exists a solution. This means there cannot be such a row in
$B$ described above. Therefore, $B$ and $C$ must have the same number of zero rows and so
they have the same number of nonzero rows. Therefore, the rank of the two matrices in 4.5
is the same. ■
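
The rank condition in Proposition 4.4.1 translates directly into a solvability test. Here is a minimal sketch under the assumption that the matrix is given as a list of rows; the function names `rank` and `has_solution` are illustrative, and exact fractions are used so that the test for a nonzero entry is reliable.

```python
from fractions import Fraction

def rank(M):
    """Rank computed by row reduction to echelon form over exact fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivot rows found so far
    for c in range(len(A[0])):
        pivot = next((k for k in range(r, len(A)) if A[k][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for k in range(r + 1, len(A)):
            factor = A[k][c] / A[r][c]
            A[k] = [x - factor * y for x, y in zip(A[k], A[r])]
        r += 1
        if r == len(A):
            break
    return r

def has_solution(A, b):
    """True exactly when rank (A | b) equals rank (A)."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(augmented) == rank(A)

A = [[1, 2],
     [2, 4]]
print(has_solution(A, [3, 6]))  # True: the second equation is twice the first
print(has_solution(A, [3, 7]))  # False: the two equations contradict each other
```

When the test fails, row reducing the augmented matrix produces exactly the kind of row described in the proof, zeros followed by a nonzero last entry.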

4.5 Fredholm Alternative 

There is a very useful version of Proposition 4.4.1 known as the Fredholm alternative. 
I will only present this for the case of real matrices here. Later a much more elegant and 
general approach is presented which allows for the general case of complex matrices. 
The following definition is used to state the Fredholm alternative. 

Definition 4.5.1 Let $S \subseteq \mathbb{R}^m$. Then $S^{\perp} = \{\mathbf{z} \in \mathbb{R}^m : \mathbf{z} \cdot \mathbf{s} = 0 \text{ for every } \mathbf{s} \in S\}$. The funny
exponent, $\perp$ is called "perp".



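Now note

$$\ker(A^T) = \{\mathbf{z} : A^T \mathbf{z} = \mathbf{0}\} = \left\{ \mathbf{z} : \sum_{k=1}^{m} z_k \mathbf{a}_k = \mathbf{0} \right\}$$

where here $\mathbf{a}_k$ denotes the $k^{th}$ column of $A^T$.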



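Lemma 4.5.2 Let $A$ be a real $m \times n$ matrix, let $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in \mathbb{R}^m$. Then

$$(A\mathbf{x} \cdot \mathbf{y}) = (\mathbf{x} \cdot A^T \mathbf{y})$$

Proof: This follows right away from the definition of the inner product and matrix
multiplication.

$$(A\mathbf{x} \cdot \mathbf{y}) = \sum_{k,l} A_{kl} x_l y_k = \sum_{k,l} (A^T)_{lk} x_l y_k = (\mathbf{x} \cdot A^T \mathbf{y}). \ ■$$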

Now it is time to state the Fredholm alternative. The first version of this is the following 
theorem. 

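Theorem 4.5.3 Let $A$ be a real $m \times n$ matrix and let $\mathbf{b} \in \mathbb{R}^m$. There exists a solution, $\mathbf{x}$
to the equation $A\mathbf{x} = \mathbf{b}$ if and only if $\mathbf{b} \in \ker(A^T)^{\perp}$.

Proof: First suppose $\mathbf{b} \in \ker(A^T)^{\perp}$. Then this says that if $A^T \mathbf{x} = \mathbf{0}$, it follows that
$\mathbf{b} \cdot \mathbf{x} = 0$. In other words, taking the transpose, if

$$\mathbf{x}^T A = \mathbf{0}, \text{ then } \mathbf{x}^T \mathbf{b} = 0.$$

Thus, if $P$ is a product of elementary matrices such that $PA$ is in row reduced echelon form,
then if $PA$ has a row of zeros, in the $k^{th}$ position, then there is also a zero in the $k^{th}$ position
of $P\mathbf{b}$ (take $\mathbf{x}^T$ to be the $k^{th}$ row of $P$). Thus $\operatorname{rank}\begin{pmatrix} A & | & \mathbf{b} \end{pmatrix} = \operatorname{rank}(A)$. By Proposition 4.4.1, there exists a solution, $\mathbf{x}$
to the system $A\mathbf{x} = \mathbf{b}$. It remains to go the other direction.

Let $\mathbf{z} \in \ker(A^T)$ and suppose $A\mathbf{x} = \mathbf{b}$. I need to verify $\mathbf{b} \cdot \mathbf{z} = 0$. By Lemma 4.5.2,

$$\mathbf{b} \cdot \mathbf{z} = A\mathbf{x} \cdot \mathbf{z} = \mathbf{x} \cdot A^T \mathbf{z} = \mathbf{x} \cdot \mathbf{0} = 0. \ ■$$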

This implies the following corollary which is also called the Fredholm alternative. The 
"alternative" becomes more clear in this corollary. 
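Corollary 4.5.4 Let $A$ be an $m \times n$ matrix. Then $A$ maps $\mathbb{R}^n$ onto $\mathbb{R}^m$ if and only if the
only solution to $A^T \mathbf{x} = \mathbf{0}$ is $\mathbf{x} = \mathbf{0}$.

Proof: If the only solution to $A^T \mathbf{x} = \mathbf{0}$ is $\mathbf{x} = \mathbf{0}$, then $\ker(A^T) = \{\mathbf{0}\}$ and so $\ker(A^T)^{\perp} =
\mathbb{R}^m$ because every $\mathbf{b} \in \mathbb{R}^m$ has the property that $\mathbf{b} \cdot \mathbf{0} = 0$. Therefore, $A\mathbf{x} = \mathbf{b}$ has a solution
for any $\mathbf{b} \in \mathbb{R}^m$ because the $\mathbf{b}$ for which there is a solution are those in $\ker(A^T)^{\perp}$ by
Theorem 4.5.3. In other words, $A$ maps $\mathbb{R}^n$ onto $\mathbb{R}^m$.

Conversely if $A$ is onto, then by Theorem 4.5.3 every $\mathbf{b} \in \mathbb{R}^m$ is in $\ker(A^T)^{\perp}$ and so if
$A^T \mathbf{x} = \mathbf{0}$, then $\mathbf{b} \cdot \mathbf{x} = 0$ for every $\mathbf{b}$. In particular, this holds for $\mathbf{b} = \mathbf{x}$. Hence if $A^T \mathbf{x} = \mathbf{0}$,
then $\mathbf{x} = \mathbf{0}$. ■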
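
The Fredholm alternative of Theorem 4.5.3 can be checked on a small example by comparing the two sides of the "if and only if" directly. This is a sketch under stated assumptions, not part of the original text: the helper names are invented, exact fractions replace real numbers, and orthogonality to $\ker(A^T)$ is tested against a basis of that kernel, which suffices by linearity.

```python
from fractions import Fraction

def rref(M):
    """Return (R, pivots): a row reduced echelon form of M and its pivot columns."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        pivot = next((k for k in range(r, rows) if A[k][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        lead = A[r][c]
        A[r] = [x / lead for x in A[r]]
        for k in range(rows):
            if k != r and A[k][c] != 0:
                factor = A[k][c]
                A[k] = [x - factor * y for x, y in zip(A[k], A[r])]
        pivots.append(c)
        r += 1
    return A, pivots

def null_space_basis(M):
    """One basis vector of ker(M) for each non-pivot column."""
    R, pivots = rref(M)
    n = len(M[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row, p in enumerate(pivots):
            v[p] = -R[row][f]
        basis.append(v)
    return basis

def in_perp_of_kernel_of_transpose(A, b):
    """True when b is orthogonal to every basis vector of ker(A^T)."""
    At = [list(col) for col in zip(*A)]
    return all(sum(bi * zi for bi, zi in zip(b, z)) == 0
               for z in null_space_basis(At))

def has_solution(A, b):
    """True when rank (A | b) equals rank (A), as in Proposition 4.4.1."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return len(rref(augmented)[1]) == len(rref(A)[1])

A = [[1, 2],
     [2, 4],
     [0, 1]]
for b in ([1, 2, 5], [1, 3, 0], [0, 0, 0], [2, 5, 1]):
    print(b, has_solution(A, b), in_perp_of_kernel_of_transpose(A, b))
```

For this particular $A$ the kernel of $A^T$ is spanned by $(-2, 1, 0)^T$, so the system is solvable exactly when the second entry of $\mathbf{b}$ is twice the first.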

Here is an amusing example.

Example 4.5.5 Let $A$ be an $m \times n$ matrix in which $m > n$. Then $A$ cannot map onto $\mathbb{R}^m$.
The reason for this is that $A^T$ is an $n \times m$ where $m > n$ and so in the augmented matrix

$$\begin{pmatrix} A^T & | & \mathbf{0} \end{pmatrix}$$

there must be some free variables. Thus there exists a nonzero vector $\mathbf{x}$ such that $A^T \mathbf{x} = \mathbf{0}$.



4.6 Exercises 



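1. Let $\{\mathbf{u}_1, \cdots, \mathbf{u}_n\}$ be vectors in $\mathbb{R}^n$. The parallelepiped determined by these vectors
$P(\mathbf{u}_1, \cdots, \mathbf{u}_n)$ is defined as

$$P(\mathbf{u}_1, \cdots, \mathbf{u}_n) = \left\{ \sum_{k=1}^{n} t_k \mathbf{u}_k : t_k \in [0, 1] \text{ for all } k \right\}.$$

Now let $A$ be an $n \times n$ matrix. Show that

$$\{A\mathbf{x} : \mathbf{x} \in P(\mathbf{u}_1, \cdots, \mathbf{u}_n)\}$$

is also a parallelepiped.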

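2. In the context of Problem 1, draw $P(\mathbf{e}_1, \mathbf{e}_2)$ where $\mathbf{e}_1, \mathbf{e}_2$ are the standard basis vectors
for $\mathbb{R}^2$. Thus $\mathbf{e}_1 = (1, 0)$, $\mathbf{e}_2 = (0, 1)$. Now suppose

$$E = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

where $E$ is the elementary matrix which takes the second row and adds to the first.
Draw

$$\{E\mathbf{x} : \mathbf{x} \in P(\mathbf{e}_1, \mathbf{e}_2)\}.$$

In other words, draw the result of doing $E$ to the vectors in $P(\mathbf{e}_1, \mathbf{e}_2)$. Next draw the
results of doing the other elementary matrices to $P(\mathbf{e}_1, \mathbf{e}_2)$.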

3. In the context of Problem 1, either draw or describe the result of doing elementary
matrices to $P(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$. Describe geometrically the conclusion of Corollary 4.3.7.

4. Consider a permutation of $\{1, 2, \cdots, n\}$. This is an ordered list of numbers taken from
this list with no repeats, $\{i_1, i_2, \cdots, i_n\}$. Define the permutation matrix $P(i_1, i_2, \cdots, i_n)$
as the matrix which is obtained from the identity matrix by placing the $j^{th}$ column
of $I$ as the $i_j^{th}$ column of $P(i_1, i_2, \cdots, i_n)$. Thus the 1 in the $i_j^{th}$ column of this
permutation matrix occurs in the $j^{th}$ slot. What does this permutation matrix do to the
column vector $(1, 2, \cdots, n)^T$?

5. fConsider the $3 \times 3$ permutation matrices. List all of them and then determine the
dimension of their span. Recall that you can consider an $m \times n$ matrix as something
in $\mathbb{F}^{nm}$.

6. Determine which matrices are in row reduced echelon form. 

10 0' 
12 


110 5 

(c) ( 1 2 4 

v 1 3 

7. Row reduce the following matrices to obtain the row reduced echelon form. List the 
pivot columns in the original matrix. 



8. Find the rank and nullity of the following matrices. If the rank is $r$, identify $r$ columns
in the original matrix which have the property that every other column may be
written as a linear combination of these.

10 2 1 2 J ' 

3 2 12 1 6 I 

11 5 2 I 

2 17 3^ 

10 2 10 

3 2 6 5 4 

112 2 2 

2 14 3 2 

10 2 112 



112 2 1 
2 14 3 1 

9. Find the rank of the following matrices. If the rank is $r$, identify $r$ columns in the
original matrix which have the property that every other column may be written
as a linear combination of these. Also find a basis for the row and column spaces of
the matrices.
(d)



1 

4 1 1 



3 2 12 1 6 I 

Oil 5 02: 

2 1 7 3^ 

10 2 10 

3 2 6 5 4 

112 2 2 

2 14 3 2 

10 2 112 

3 2 6 15 1 

112 2 1 

2 14 3 1 



10. Suppose $A$ is an $m \times n$ matrix. Explain why the rank of $A$ is always no larger than
$\min(m, n)$.

11. Suppose $A$ is an $m \times n$ matrix in which $m \le n$. Suppose also that the rank of $A$ equals
$m$. Show that $A$ maps $\mathbb{F}^n$ onto $\mathbb{F}^m$. Hint: The vectors $\mathbf{e}_1, \cdots, \mathbf{e}_m$ occur as columns
in the row reduced echelon form for $A$.



12. Suppose $A$ is an $m \times n$ matrix and that $m > n$. Show there exists $\mathbf{b} \in \mathbb{F}^m$ such that
there is no solution to the equation

$$A\mathbf{x} = \mathbf{b}.$$



13. Suppose $A$ is an $m \times n$ matrix in which $m \ge n$. Suppose also that the rank of $A$
equals $n$. Show that $A$ is one to one. Hint: If not, there exists a vector, $\mathbf{x} \neq \mathbf{0}$ such
that $A\mathbf{x} = \mathbf{0}$, and this implies at least one column of $A$ is a linear combination of the
others. Show this would require the column rank to be less than $n$.

14. Explain why an $n \times n$ matrix $A$ is both one to one and onto if and only if its rank is
$n$.

15. Suppose $A$ is an $m \times n$ matrix and $\{\mathbf{w}_1, \cdots, \mathbf{w}_k\}$ is a linearly independent set of
vectors in $A(\mathbb{F}^n) \subseteq \mathbb{F}^m$. Suppose also that $A\mathbf{z}_i = \mathbf{w}_i$. Show that $\{\mathbf{z}_1, \cdots, \mathbf{z}_k\}$ is also
linearly independent.

16. Show $\operatorname{rank}(A + B) \le \operatorname{rank}(A) + \operatorname{rank}(B)$.

17. Suppose $A$ is an $m \times n$ matrix, $m \ge n$ and the columns of $A$ are independent. Suppose
also that $\{\mathbf{z}_1, \cdots, \mathbf{z}_k\}$ is a linearly independent set of vectors in $\mathbb{F}^n$. Show that
$\{A\mathbf{z}_1, \cdots, A\mathbf{z}_k\}$ is linearly independent.

18. Suppose $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix. Show that

$$\dim(\ker(AB)) \le \dim(\ker(A)) + \dim(\ker(B)).$$

Hint: Consider the subspace, $B(\mathbb{F}^p) \cap \ker(A)$ and suppose a basis for this subspace
is $\{\mathbf{w}_1, \cdots, \mathbf{w}_k\}$. Now suppose $\{\mathbf{u}_1, \cdots, \mathbf{u}_r\}$ is a basis for $\ker(B)$. Let $\{\mathbf{z}_1, \cdots, \mathbf{z}_k\}$
be such that $B\mathbf{z}_i = \mathbf{w}_i$ and argue that

$$\ker(AB) \subseteq \operatorname{span}(\mathbf{u}_1, \cdots, \mathbf{u}_r, \mathbf{z}_1, \cdots, \mathbf{z}_k).$$

19. Let m < n and let A be an m x n matrix. Show that A is not one to one. 

20. Let $A$ be an $m \times n$ real matrix and let $\mathbf{b} \in \mathbb{R}^m$. Show there exists a solution, $\mathbf{x}$ to the
system

$$A^T A \mathbf{x} = A^T \mathbf{b}$$

Next show that if $\mathbf{x}, \mathbf{x}_1$ are two solutions, then $A\mathbf{x} = A\mathbf{x}_1$. Hint: First show that
$(A^T A)^T = A^T A$. Next show if $\mathbf{x} \in \ker(A^T A)$, then $A\mathbf{x} = \mathbf{0}$. Finally apply the Fredholm
alternative. Show $A^T \mathbf{b} \in \ker(A^T A)^{\perp}$. This will give existence of a solution.

21. Show that in the context of Problem 20 that if $\mathbf{x}$ is the solution there, then $|\mathbf{b} - A\mathbf{x}| \le
|\mathbf{b} - A\mathbf{y}|$ for every $\mathbf{y}$. Thus $A\mathbf{x}$ is the point of $A(\mathbb{R}^n)$ which is closest to $\mathbf{b}$ of every
point in $A(\mathbb{R}^n)$. This is a solution to the least squares problem.

22. fHere is a point in M 4 : (1, 2, 3, 4) J . Find the point in span ' ' 

, 3 
s closest to the given point. 

23. fHere is a point in $\mathbb{R}^4$: $(1, 2, 3, 4)^T$. Find the point on the plane described by $x + 2y -
4z + 4w = 0$ which is closest to the given point.

24. Suppose A, B are two invertible n x n matrices. Show there exists a sequence of row 
operations which when done to A yield B. Hint: Recall that every invertible matrix 
is a product of elementary matrices. 

25. If A is invertible and n x n and B is n x p, show that AB has the same null space as 
B and also the same rank as B. 

26. Here are two matrices in row reduced echelon form 



1 1 \ 




/ 1 


Oil 


. B = 


11 


0/ 




I o o o 



Does there exist a sequence of row operations which when done to $A$ will yield $B$?

Explain. 

27. Is it true that an upper triangular matrix has rank equal to the number of nonzero
entries down the main diagonal?

28. Let $\{\mathbf{v}_1, \cdots, \mathbf{v}_{n-1}\}$ be vectors in $\mathbb{F}^n$. Describe a systematic way to obtain a vector $\mathbf{v}_n$
which is perpendicular to each of these vectors. Hint: You might consider something
like this

$$\det \begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_n \\ v_{11} & v_{12} & \cdots & v_{1n} \\ \vdots & \vdots & & \vdots \\ v_{(n-1)1} & v_{(n-1)2} & \cdots & v_{(n-1)n} \end{pmatrix}$$

where $v_{ij}$ is the $j^{th}$ entry of the vector $\mathbf{v}_i$. This is a lot like the cross product.

29. Let $A$ be an $m \times n$ matrix. Then $\ker(A)$ is a subspace of $\mathbb{F}^n$. Is it true that every
subspace of $\mathbb{F}^n$ is the kernel or null space of some matrix? Prove or disprove.

30. Let $A$ be an $n \times n$ matrix and let $P^{ij}$ be the permutation matrix which switches the $i^{th}$
and $j^{th}$ rows of the identity. Show that $P^{ij} A P^{ij}$ produces a matrix which is similar
to $A$ which switches the $i^{th}$ and $j^{th}$ entries on the main diagonal.

31. Recall the procedure for finding the inverse of a matrix on Page 62. It was shown that 
the procedure, when it works, finds the inverse of the matrix. Show that whenever 
the matrix has an inverse, the procedure works. 



5.1 LU Factorization

An LU factorization of a matrix involves writing the given matrix as the product of a 
lower triangular matrix which has the main diagonal consisting entirely of ones, L, and an 
upper triangular matrix U in the indicated order. The L goes with "lower" and the U with 
"upper" . It turns out many matrices can be written in this way and when this is possible, 
people get excited about slick ways of solving the system of equations, Ax = y. The method 
lacks generality but is of interest just the same. 

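Example 5.1.1 Can you write $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ in the form LU as just described?

To do so you would need

$$\begin{pmatrix} 1 & 0 \\ x & 1 \end{pmatrix}\begin{pmatrix} a & b \\ 0 & c \end{pmatrix} = \begin{pmatrix} a & b \\ xa & xb + c \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Therefore, $b = 1$ and $a = 0$. Also, from the bottom rows, $xa = 1$, which can't happen when
$a = 0$. Therefore, you can't write this matrix in the form LU. It has no LU factorization.
This is what I mean above by saying the method lacks generality.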

Which matrices have an LU factorization? It turns out it is those whose row reduced 
echelon form can be achieved without switching rows and which only involve row operations 
of type 3 in which row j is replaced with a multiple of row i added to row j for i < j. 
5.2 Finding An LU Factorization

There is a convenient procedure for finding an LU factorization. It turns out that it is
only necessary to keep track of the multipliers which are used to row reduce to upper
triangular form. This procedure is described in the following examples and is called the
multiplier method. It is due to Doolittle.



Example 5.2.1 Find an LU factorization for $A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & -4 \\ 1 & 5 & 2 \end{pmatrix}$.

Write the matrix next to the identity matrix as shown.

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & -4 \\ 1 & 5 & 2 \end{pmatrix}$$

The process involves doing row operations to the matrix on the right while simultaneously
updating successive columns of the matrix on the left. First take $-2$ times the first row and
add to the second in the matrix on the right.

$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 \\ 0 & -3 & -10 \\ 1 & 5 & 2 \end{pmatrix}$$

Note the method for updating the matrix on the left. The 2 in the second entry of the first
column is there because $-2$ times the first row of A added to the second row of A produced
a 0. Now replace the third row in the matrix on the right by $-1$ times the first row added
to the third. Thus the next step is

$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 \\ 0 & -3 & -10 \\ 0 & 3 & -1 \end{pmatrix}$$

Finally, add the second row to the bottom row and make the following changes

$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & -1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 \\ 0 & -3 & -10 \\ 0 & 0 & -11 \end{pmatrix}$$

At this point, stop because the matrix on the right is upper triangular. An LU factorization
is the above.

The justification for this gimmick will be given later.



Example 5.2.2 Find an LU factorization for $A = \begin{pmatrix} 1 & 2 & 1 & 2 & 1 \\ 2 & 0 & 2 & 1 & 1 \\ 2 & 3 & 1 & 3 & 2 \\ 1 & 0 & 1 & 1 & 2 \end{pmatrix}$.

This time everything is done at once for a whole column. This saves trouble. First
multiply the first row by $(-1)$ and then add to the last row. Next take $(-2)$ times the first
and add to the second and then $(-2)$ times the first and add to the third.

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 2 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 1 & 2 & 1 \\ 0 & -4 & 0 & -3 & -1 \\ 0 & -1 & -1 & -1 & 0 \\ 0 & -2 & 0 & -1 & 1 \end{pmatrix}$$

This finishes the first column of L and the first column of U. Now take $-(1/4)$ times the
second row in the matrix on the right and add to the third followed by $-(1/2)$ times the
second added to the last.

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 2 & 1/4 & 1 & 0 \\ 1 & 1/2 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 1 & 2 & 1 \\ 0 & -4 & 0 & -3 & -1 \\ 0 & 0 & -1 & -1/4 & 1/4 \\ 0 & 0 & 0 & 1/2 & 3/2 \end{pmatrix}$$

This finishes the second column of L as well as the second column of U. Since the matrix
on the right is upper triangular, stop. The LU factorization has now been obtained. This
technique is called Doolittle's method.

This process is entirely typical of the general case. The matrix U is just the first upper
triangular matrix you come to in your quest for the row reduced echelon form using only
the row operation which involves replacing a row by itself added to a multiple of another
row. The matrix L is what you get by updating the identity matrix as illustrated above.

You should note that for a square matrix, the number of row operations necessary to 
reduce to LU form is about half the number needed to place the matrix in row reduced 
echelon form. This is why an LU factorization is of interest in solving systems of equations. 
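
The multiplier method of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a production routine: the name `lu_multiplier_method` is made up here, exact arithmetic is done with `fractions.Fraction`, and the sketch simply raises `ValueError` when a zero pivot would force a row switch. It is run on the 4 x 5 matrix of Example 5.2.2.

```python
from fractions import Fraction


def lu_multiplier_method(A):
    """Return (L, U) with A = L U, L unit lower triangular, U upper triangular.

    Hypothetical helper for illustration. Only the row operation "add a
    multiple of a row to a lower row" is used, so a ValueError is raised
    if a row switch would be needed.
    """
    m, n = len(A), len(A[0])
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    for j in range(min(m - 1, n)):
        if U[j][j] == 0:
            if any(U[i][j] != 0 for i in range(j + 1, m)):
                raise ValueError("zero pivot: a row switch would be needed")
            continue  # nothing to eliminate in this column
        for i in range(j + 1, m):
            mult = U[i][j] / U[j][j]
            L[i][j] = mult  # record the multiplier in column j of L
            U[i] = [ui - mult * uj for ui, uj in zip(U[i], U[j])]
    return L, U


# The matrix of Example 5.2.2.
A = [[1, 2, 1, 2, 1],
     [2, 0, 2, 1, 1],
     [2, 3, 1, 3, 2],
     [1, 0, 1, 1, 2]]
L, U = lu_multiplier_method(A)
for row in L:
    print([str(v) for v in row])
for row in U:
    print([str(v) for v in row])
```

Calling the same function on the matrix of Example 5.1.1 raises the `ValueError`, which is the computational face of the statement that this matrix has no LU factorization.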



5.3 Solving Linear Systems Using An LU Factorization 

The reason people care about the LU factorization is it allows the quick solution of systems 
of equations. Here is an example. 



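Example 5.3.1 Suppose you want to find the solutions to

$$\begin{pmatrix} 1 & 2 & 3 & 2 \\ 4 & 3 & 1 & 1 \\ 1 & 2 & 3 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}.$$

Of course one way is to write the augmented matrix and grind away. However, this
involves more row operations than the computation of an LU factorization and it turns out
that an LU factorization can give the solution quickly. Here is how. The following is an LU
factorization for the matrix.

$$\begin{pmatrix} 1 & 2 & 3 & 2 \\ 4 & 3 & 1 & 1 \\ 1 & 2 & 3 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{pmatrix}$$

Let $U\mathbf{x} = \mathbf{y}$ and consider $L\mathbf{y} = \mathbf{b}$ where in this case, $\mathbf{b} = (1, 2, 3)^T$. Thus

$$\begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$

which yields very quickly that $\mathbf{y} = (1, -2, 2)^T$. Now you can find $\mathbf{x}$ by solving $U\mathbf{x} = \mathbf{y}$. Thus
in this case,

$$\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix}$$

which yields

$$\mathbf{x} = \begin{pmatrix} -\frac{3}{5} + \frac{7}{5}t \\ \frac{9}{5} - \frac{11}{5}t \\ t \\ -1 \end{pmatrix}, \quad t \in \mathbb{R}.$$

Work this out by hand and you will see the advantage of working only with triangular
matrices.

It may seem like a trivial thing but it is used because it cuts down on the number of
operations involved in finding a solution to a system of equations enough that it makes a
difference for large systems.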
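
The two triangular solves of Example 5.3.1 can be checked with a short Python sketch. The helper names `forward_substitution`, `solution` and `times` are invented for this illustration; the back substitution is written out by hand for this particular $U$, with $z = t$ treated as the free variable, so it is not a general solver.

```python
from fractions import Fraction


def forward_substitution(L, b):
    """Solve L y = b for a lower triangular L with nonzero diagonal."""
    y = []
    for i, row in enumerate(L):
        known = sum(row[k] * y[k] for k in range(i))
        y.append((Fraction(b[i]) - known) / row[i])
    return y


A = [[1, 2, 3, 2], [4, 3, 1, 1], [1, 2, 3, 0]]
L = [[1, 0, 0], [4, 1, 0], [1, 0, 1]]
U = [[1, 2, 3, 2], [0, -5, -11, -7], [0, 0, 0, -2]]
b = [1, 2, 3]

y = forward_substitution(L, b)


def solution(t):
    """Back substitution in U x = y for this specific U, with z = t free."""
    t = Fraction(t)
    w = y[2] / U[2][3]
    yy = (y[1] - U[1][2] * t - U[1][3] * w) / U[1][1]
    x = y[0] - U[0][1] * yy - U[0][2] * t - U[0][3] * w
    return [x, yy, t, w]


def times(M, v):
    """Matrix-vector product."""
    return [sum(a * c for a, c in zip(row, v)) for row in M]


print([str(v) for v in y])
print([str(v) for v in solution(0)])
print(times(A, solution(2)) == b)
```

Each solve touches only one triangle of coefficients, which is where the saving over a full row reduction comes from.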

5.4 The PLU Factorization 

As indicated above, some matrices don't have an LU factorization. Here is an example. 

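$$M = \begin{pmatrix} 1 & 2 & 3 & 2 \\ 1 & 2 & 3 & 0 \\ 4 & 3 & 1 & 1 \end{pmatrix} \tag{5.1}$$

In this case, there is another factorization which is useful called a PLU factorization. Here
P is a permutation matrix.

Example 5.4.1 Find a PLU factorization for the above matrix in 5.1.

Proceed as before trying to find the row echelon form of the matrix. First add $-1$ times
the first row to the second row and then add $-4$ times the first to the third. This yields

$$\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 4 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & 0 & 0 & -2 \\ 0 & -5 & -11 & -7 \end{pmatrix}$$

There is no way to do only row operations involving replacing a row with itself added to a
multiple of another row to the second matrix in such a way as to obtain an upper triangular
matrix. Therefore, consider M with the bottom two rows switched.

$$M' = \begin{pmatrix} 1 & 2 & 3 & 2 \\ 4 & 3 & 1 & 1 \\ 1 & 2 & 3 & 0 \end{pmatrix}$$

Now try again with this matrix. First take $-1$ times the first row and add to the bottom
row and then take $-4$ times the first row and add to the second row. This yields

$$\begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{pmatrix}$$

The second matrix is upper triangular and so the LU factorization of the matrix $M'$ is

$$\begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{pmatrix}.$$

Thus $M' = PM = LU$ where L and U are given above. Therefore, $M = P^2 M = PLU$ and so

$$\begin{pmatrix} 1 & 2 & 3 & 2 \\ 1 & 2 & 3 & 0 \\ 4 & 3 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{pmatrix}$$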

This process can always be followed and so there always exists a PLU factorization of a 
given matrix even though there isn't always an LU factorization. 



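Example 5.4.2 Use a PLU factorization of $M = \begin{pmatrix} 1 & 2 & 3 & 2 \\ 1 & 2 & 3 & 0 \\ 4 & 3 & 1 & 1 \end{pmatrix}$ to solve the system
$M\mathbf{x} = \mathbf{b}$ where $\mathbf{b} = (1, 2, 3)^T$.

Let $U\mathbf{x} = \mathbf{y}$ and consider $PL\mathbf{y} = \mathbf{b}$. In other words, solve

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}.$$

Then multiplying both sides by P gives

$$\begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}$$

and so $\mathbf{y} = (1, -1, 1)^T$. Now $U\mathbf{x} = \mathbf{y}$ and so it only remains to solve

$$\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$$

which yields

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} \frac{1}{5} + \frac{7}{5}t \\ \frac{9}{10} - \frac{11}{5}t \\ t \\ -\frac{1}{2} \end{pmatrix}, \quad t \in \mathbb{R}.$$
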

5.5 Justification For The Multiplier Method 

Why does the multiplier method work for finding an LU factorization? Suppose A is a 
matrix which has the property that the row reduced echelon form for A may be achieved 
using only the row operations which involve replacing a row with itself added to a multiple 
of another row. It is not ever necessary to switch rows. Thus every row which is replaced 
using this row operation in obtaining the echelon form may be modified by using a row 
which is above it. Furthermore, in the multiplier method for finding the LU factorization, 
we zero out the elements below the pivot entry in first column and then the next and so on 
when scanning from the left. In terms of elementary matrices, this means the row operations 
used to reduce A to upper triangular form correspond to multiplication on the left by lower 
triangular matrices having all ones down the main diagonal and the sequence of elementary 
matrices which row reduces A has the property that in scanning the list of elementary 
matrices from the right to the left, this list consists of several matrices which involve only 
changes from the identity in the first column, then several which involve only changes from 
the identity in the second column and so forth. More precisely, $E_p \cdots E_1 A = U$ where $U$
is upper triangular, each $E_i$ is a lower triangular elementary matrix having all ones down
the main diagonal, for some $r_i$, each of $E_{r_1} \cdots E_1$ differs from the identity only in the first
column, each of $E_{r_2} \cdots E_{r_1 + 1}$ differs from the identity only in the second column and so
forth. Therefore, $A = E_1^{-1} \cdots E_{p-1}^{-1} E_p^{-1} U$. You multiply the inverses in the reverse order.
Now each of the $E_i^{-1}$ is also lower triangular with 1 down the main diagonal. Therefore
their product has this property. Recall also that if $E_i$ equals the identity matrix except
for having an $a$ in the $j^{th}$ column somewhere below the main diagonal, $E_i^{-1}$ is obtained by
replacing the $a$ in $E_i$ with $-a$, thus explaining why we replace with $-1$ times the multiplier
in computing L. In the case where A is a $3 \times m$ matrix, $E_1^{-1} \cdots E_{p-1}^{-1} E_p^{-1}$ is of the form

$$\begin{pmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ b & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & c & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & c & 1 \end{pmatrix}$$

Note that scanning from left to right, the first two in the product involve changes in the
identity only in the first column while in the third matrix, the change is only in the second.
If the entries in the first column had been zeroed out in a different order, the following
would have resulted.

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ b & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & c & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & c & 1 \end{pmatrix}$$

However, it is important to be working from the left to the right, one column at a time.

A similar observation holds in any dimension. Multiplying the elementary matrices which
involve a change only in the $j^{th}$ column you obtain A equal to an upper triangular, $n \times m$
matrix U which is multiplied by a sequence of lower triangular matrices on its left which is
of the following form, in which the $a_{ij}$ are negatives of multipliers used in row reducing to
an upper triangular matrix.

$$\begin{pmatrix} 1 & 0 & \cdots & 0 \\ a_{11} & 1 & & \vdots \\ \vdots & & \ddots & 0 \\ a_{1,n-1} & 0 & \cdots & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & \vdots \\ \vdots & a_{21} & \ddots & 0 \\ 0 & a_{2,n-2} & \cdots & 1 \end{pmatrix} \cdots \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & a_{n-1,1} & 1 \end{pmatrix}$$

From the matrix multiplication, this product equals

$$\begin{pmatrix} 1 & & & \\ a_{11} & 1 & & \\ \vdots & \vdots & \ddots & \\ a_{1,n-1} & a_{2,n-2} & \cdots & 1 \end{pmatrix}$$

Notice how the end result of the matrix multiplication made no change in the $a_{ij}$. It just
filled in the empty spaces with the $a_{ij}$ which occurred in one of the matrices in the product.
This is why, in computing L, it is sufficient to begin with the left column and work column
by column toward the right, replacing entries with the negative of the multiplier used in the
row operation which produces a zero in that entry.
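
The "fill in the empty spaces" behaviour, and the reason the column order matters, can be seen numerically in the $3 \times 3$ case. The following Python sketch uses made-up helper names (`matmul`, `elementary`) and arbitrary sample values for $a, b, c$; it only illustrates the claim and proves nothing in general.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def elementary(n, i, j, value):
    """Identity matrix with `value` placed in row i, column j (0-based, i > j)."""
    E = [[int(r == c) for c in range(n)] for r in range(n)]
    E[i][j] = value
    return E


a, b, c = 2, 3, 5
Ea = elementary(3, 1, 0, a)   # change in the first column
Eb = elementary(3, 2, 0, b)   # change in the first column
Ec = elementary(3, 2, 1, c)   # change in the second column

in_order = matmul(matmul(Ea, Eb), Ec)       # first column factors, then second
swapped_first = matmul(matmul(Eb, Ea), Ec)  # first column factors reordered
out_of_order = matmul(matmul(Ec, Ea), Eb)   # second column factor moved to the front

print(in_order)
print(swapped_first)
print(out_of_order)
```

With the first column factors on the left, the product simply collects $a$, $b$, $c$ into their positions. Moving the second column factor to the front changes the bottom left entry to $ac + b$, so the entries are no longer just copied.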

5.6 Existence For The PLU Factorization 

Here I will consider an invertible n x n matrix and show that such a matrix always has 
a PLU factorization. More general matrices could also be considered but this is all I will 
present. 

Let $A$ be such an invertible matrix and consider the first column of $A$. If $A_{11} \neq 0$, use
this to zero out everything below it. The entry $A_{11}$ is called the pivot. Thus in this case
there is a lower triangular matrix $L_1$ which has all ones on the diagonal such that

$$L_1 P_1 A = \begin{pmatrix} * & * \\ \mathbf{0} & A_1 \end{pmatrix} \tag{5.2}$$

Here $P_1 = I$. In case $A_{11} = 0$, let $r$ be such that $A_{r1} \neq 0$ and $r$ is the first entry for which
this happens. In this case, let $P_1$ be the permutation matrix which switches the first row
and the $r^{th}$ row. Then as before, there exists a lower triangular matrix $L_1$ which has all
ones on the diagonal such that 5.2 holds in this case also. In the first column, this $L_1$ has
zeros between the first row and the $r^{th}$ row.

Go to $A_1$. Following the same procedure as above, there exists a lower triangular matrix
and permutation matrix $\widehat{L}_2, \widehat{P}_2$ such that

$$\widehat{L}_2 \widehat{P}_2 A_1 = \begin{pmatrix} * & * \\ \mathbf{0} & A_2 \end{pmatrix}.$$

Let

$$L_2 = \begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \widehat{L}_2 \end{pmatrix}, \quad P_2 = \begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \widehat{P}_2 \end{pmatrix}.$$

Then using block multiplication, Theorem 3.5.2,

$$L_2 P_2 L_1 P_1 A = \begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \widehat{L}_2 \end{pmatrix}\begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \widehat{P}_2 \end{pmatrix}\begin{pmatrix} * & * \\ \mathbf{0} & A_1 \end{pmatrix} = \begin{pmatrix} * & * \\ \mathbf{0} & \widehat{L}_2 \widehat{P}_2 A_1 \end{pmatrix} = \begin{pmatrix} * & \cdots & * \\ 0 & * & * \\ \mathbf{0} & \mathbf{0} & A_2 \end{pmatrix}$$

and $L_2$ has all the subdiagonal entries equal to 0 except possibly some nonzero entries in
the second column starting with position $r_2$ where $P_2$ switches rows $r_2$ and 2. Continuing
this way, it follows there are lower triangular matrices $L_j$ having all ones down the diagonal
and permutation matrices $P_i$ which switch only two rows such that

$$L_{n-1} P_{n-1} L_{n-2} P_{n-2} L_{n-3} \cdots L_2 P_2 L_1 P_1 A = U \tag{5.3}$$

where $U$ is upper triangular. The matrix $L_j$ has all zeros below the main diagonal except
for the $j^{th}$ column and even in this column it has zeros between position $j$ and $r_j$ where $P_j$
switches rows $j$ and $r_j$. Of course in the case where no switching is necessary, you could get
all nonzero entries below the main diagonal in the $j^{th}$ column for $L_j$.

The fact that Lj is the identity except for the j th column means that each Pk for k > j 
almost commutes with Lj. Say Pk switches the k th and the q th rows for q > k > j. When 
you place Pk on the right of Lj it just switches the k th and the q th columns and leaves the 
j th column unchanged. Therefore, the same result as placing Pk on the left of Lj can be 
obtained by placing Pk on the right of Lj and modifying Lj by switching the k th and the q th 
entries in the $j^{th}$ column. (Note this could possibly interchange a 0 for something nonzero.)
It follows from 5.3 there exists $P$, the product of permutation matrices, $P = P_{n-1} \cdots P_1$,
each of which switches two rows, and $L$ a lower triangular matrix having all ones on the
main diagonal, $L = L'_{n-1} \cdots L'_2 L'_1$, where the $L'_j$ are obtained as just described by moving a
succession of $P_k$ from the left to the right of $L_j$ and modifying the $j^{th}$ column as indicated,
such that

$$LPA = U.$$

Then

$$A = P^T L^{-1} U.$$

It is customary to write this more simply as

$$A = PLU$$

where $L$ is a lower triangular matrix having all ones on the diagonal and $P$ is a permutation
matrix consisting of $P_1 \cdots P_{n-1}$ as described above. This proves the following theorem.

Theorem 5.6.1 Let $A$ be any invertible $n \times n$ matrix. Then there exists a permutation
matrix $P$ and a lower triangular matrix $L$ having all ones on the main diagonal and an
upper triangular matrix $U$ such that

$$A = PLU.$$
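
The existence argument of this section is constructive, and it can be turned into a short Python sketch. This is an illustration under stated assumptions rather than a definitive routine: the names `plu` and `matmul` are made up here, the input is assumed square and invertible, the pivot is the first nonzero entry in the column as in the argument above, and the bookkeeping (swapping the already stored multipliers together with the rows) is one common way to organize the computation, not necessarily the way it is presented in the text.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def plu(A):
    """Return (P, L, U) with A = P L U for an invertible square A."""
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(0)] * n for _ in range(n)]
    perm = list(range(n))  # perm[i] = index of the row of A now in position i
    for j in range(n):
        # first row at or below position j with a nonzero entry in column j
        r = next((i for i in range(j, n) if U[i][j] != 0), None)
        if r is None:
            raise ValueError("matrix is not invertible")
        if r != j:
            U[j], U[r] = U[r], U[j]
            L[j], L[r] = L[r], L[j]  # carry along multipliers already stored
            perm[j], perm[r] = perm[r], perm[j]
        for i in range(j + 1, n):
            mult = U[i][j] / U[j][j]
            L[i][j] = mult
            U[i] = [ui - mult * uj for ui, uj in zip(U[i], U[j])]
    for i in range(n):
        L[i][i] = Fraction(1)
    # Row i of L U is row perm[i] of A, so A = P (L U) with P[perm[i]][i] = 1.
    P = [[0] * n for _ in range(n)]
    for i, src in enumerate(perm):
        P[src][i] = 1
    return P, L, U


A = [[0, 1, 2], [1, 1, 1], [2, 0, 1]]
P, L, U = plu(A)
print(P)
print([[str(v) for v in row] for row in L])
print([[str(v) for v in row] for row in U])
print(matmul(P, matmul(L, U)) == A)
```

The sample matrix has a zero in the top left corner, so it has no LU factorization of the kind described in Section 5.1, yet the sketch still returns a valid $P$, $L$, $U$.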



5.7 The QR Factorization 

As pointed out above, the LU factorization is not a mathematically respectable thing be- 
cause it does not always exist. There is another factorization which does always exist. 
Much more can be said about it than I will say here. At this time, I will only deal with real 
matrices and so the inner product will be the usual real dot product. 

Definition 5.7.1 An $n \times n$ real matrix $Q$ is called an orthogonal matrix if

$$QQ^T = Q^T Q = I.$$

Thus an orthogonal matrix is one whose inverse is equal to its transpose.
First note that if a matrix is orthogonal this says

$$\sum_j Q^T_{ij} Q_{jk} = \sum_j Q_{ji} Q_{jk} = \delta_{ik}$$

Thus

$$|Q\mathbf{x}|^2 = \sum_i \Big( \sum_j Q_{ij} x_j \Big)^2 = \sum_i \sum_s \sum_r Q_{is} x_s Q_{ir} x_r = \sum_r \sum_s \sum_i Q_{is} Q_{ir} x_s x_r = \sum_r \sum_s \delta_{sr} x_s x_r = \sum_r x_r^2 = |\mathbf{x}|^2$$

This shows that orthogonal transformations preserve distances. You can show that if you
have a matrix which does preserve distances, then it must be orthogonal also.

Example 5.7.2 One of the most important examples of an orthogonal matrix is the so
called Householder matrix. You have $\mathbf{v}$ a unit vector and you form the matrix

$$I - 2\mathbf{v}\mathbf{v}^T$$

This is an orthogonal matrix which is also symmetric. To see this, you use the rules of
matrix operations.

$$(I - 2\mathbf{v}\mathbf{v}^T)^T = I^T - (2\mathbf{v}\mathbf{v}^T)^T = I - 2\mathbf{v}\mathbf{v}^T$$

so it is symmetric. Now to show it is orthogonal,

$$(I - 2\mathbf{v}\mathbf{v}^T)(I - 2\mathbf{v}\mathbf{v}^T) = I - 2\mathbf{v}\mathbf{v}^T - 2\mathbf{v}\mathbf{v}^T + 4\mathbf{v}\mathbf{v}^T\mathbf{v}\mathbf{v}^T = I - 4\mathbf{v}\mathbf{v}^T + 4\mathbf{v}\mathbf{v}^T = I$$

because $\mathbf{v}^T\mathbf{v} = \mathbf{v} \cdot \mathbf{v} = |\mathbf{v}|^2 = 1$. Therefore, this is an example of an orthogonal matrix.

Consider the following problem. 

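Problem 5.7.3 Given two vectors $\mathbf{x}, \mathbf{y}$ such that $|\mathbf{x}| = |\mathbf{y}| \neq 0$ but $\mathbf{x} \neq \mathbf{y}$ and you want an
orthogonal matrix $Q$ such that $Q\mathbf{x} = \mathbf{y}$ and $Q\mathbf{y} = \mathbf{x}$. The thing which works is the
Householder matrix

$$Q \equiv I - 2\frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^2}(\mathbf{x} - \mathbf{y})^T$$

Here is why this works.

$$Q(\mathbf{x} - \mathbf{y}) = (\mathbf{x} - \mathbf{y}) - 2\frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^2}(\mathbf{x} - \mathbf{y})^T(\mathbf{x} - \mathbf{y}) = (\mathbf{x} - \mathbf{y}) - 2\frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^2}|\mathbf{x} - \mathbf{y}|^2 = \mathbf{y} - \mathbf{x}$$

$$Q(\mathbf{x} + \mathbf{y}) = (\mathbf{x} + \mathbf{y}) - 2\frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^2}(\mathbf{x} - \mathbf{y})^T(\mathbf{x} + \mathbf{y}) = (\mathbf{x} + \mathbf{y}) - 2\frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^2}\left((\mathbf{x} - \mathbf{y}) \cdot (\mathbf{x} + \mathbf{y})\right) = (\mathbf{x} + \mathbf{y}) - 2\frac{\mathbf{x} - \mathbf{y}}{|\mathbf{x} - \mathbf{y}|^2}\left(|\mathbf{x}|^2 - |\mathbf{y}|^2\right) = \mathbf{x} + \mathbf{y}$$

Hence

$$Q\mathbf{x} + Q\mathbf{y} = \mathbf{x} + \mathbf{y}$$
$$Q\mathbf{x} - Q\mathbf{y} = \mathbf{y} - \mathbf{x}$$

Adding these equations, $2Q\mathbf{x} = 2\mathbf{y}$ and subtracting them yields $2Q\mathbf{y} = 2\mathbf{x}$.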
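
The solution to Problem 5.7.3 is easy to test on a concrete pair of vectors. The Python sketch below uses exact rational arithmetic; the function names `householder_swap` and `apply` and the sample vectors are chosen here only for illustration.

```python
from fractions import Fraction


def householder_swap(x, y):
    """Return Q = I - 2 (x - y)(x - y)^T / |x - y|^2 as a list of rows.

    Assumes |x| = |y| and x != y, as in Problem 5.7.3.
    """
    d = [Fraction(a) - Fraction(b) for a, b in zip(x, y)]
    norm2 = sum(v * v for v in d)
    if norm2 == 0:
        raise ValueError("x and y must be different")
    n = len(d)
    return [[Fraction(int(i == j)) - 2 * d[i] * d[j] / norm2
             for j in range(n)] for i in range(n)]


def apply(Q, v):
    """Matrix-vector product Q v."""
    return [sum(q * a for q, a in zip(row, v)) for row in Q]


x, y = [3, 4], [5, 0]          # both have length 5
Q = householder_swap(x, y)
print([[str(v) for v in row] for row in Q])
print(apply(Q, x) == y, apply(Q, y) == x)
```

For these vectors $Q$ has rows $(3/5, 4/5)$ and $(4/5, -3/5)$, which is visibly symmetric, and multiplying $Q$ by itself gives the identity, in agreement with Example 5.7.2.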



A picture of the geometric significance follows:


The orthogonal matrix Q reflects across the dotted line taking x to y and y to x. 

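Definition 5.7.4 Let $A$ be an $m \times n$ matrix. Then a QR factorization of $A$ consists of two
matrices, $Q$ orthogonal and $R$ upper triangular (right triangular) having all the entries on
the main diagonal nonnegative such that $A = QR$.

With the solution to this simple problem, here is how to obtain a QR factorization for
any matrix $A$. Let

$$A = (\mathbf{a}_1, \mathbf{a}_2, \cdots, \mathbf{a}_n)$$

where the $\mathbf{a}_i$ are the columns. If $\mathbf{a}_1 = \mathbf{0}$, let $Q_1 = I$. If $\mathbf{a}_1 \neq \mathbf{0}$, let

$$\mathbf{b} \equiv \begin{pmatrix} |\mathbf{a}_1| \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$

and form the Householder matrix

$$Q_1 \equiv I - 2\frac{(\mathbf{a}_1 - \mathbf{b})}{|\mathbf{a}_1 - \mathbf{b}|^2}(\mathbf{a}_1 - \mathbf{b})^T$$

(If it happens that $\mathbf{a}_1 = \mathbf{b}$ already, this formula is not defined and you again let $Q_1 = I$.)
As in the above problem $Q_1 \mathbf{a}_1 = \mathbf{b}$ and so

$$Q_1 A = \begin{pmatrix} |\mathbf{a}_1| & * \\ \mathbf{0} & A_2 \end{pmatrix}$$

where $A_2$ is an $(m-1) \times (n-1)$ matrix. Now find in the same way as was just done an
$(m-1) \times (m-1)$ matrix $\widehat{Q}_2$ such that

$$\widehat{Q}_2 A_2 = \begin{pmatrix} * & * \\ \mathbf{0} & A_3 \end{pmatrix}$$

Let

$$Q_2 \equiv \begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \widehat{Q}_2 \end{pmatrix}.$$

Then

$$Q_2 Q_1 A = \begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \widehat{Q}_2 \end{pmatrix}\begin{pmatrix} |\mathbf{a}_1| & * \\ \mathbf{0} & A_2 \end{pmatrix} = \begin{pmatrix} |\mathbf{a}_1| & * & * \\ 0 & * & * \\ \mathbf{0} & \mathbf{0} & A_3 \end{pmatrix}$$

Continuing this way until the result is upper triangular, you get a sequence of orthogonal
matrices $Q_p, Q_{p-1}, \cdots, Q_1$ such that

$$Q_p Q_{p-1} \cdots Q_1 A = R \tag{5.4}$$

where $R$ is upper triangular.

Now if $Q_1$ and $Q_2$ are orthogonal, then from properties of matrix multiplication,

$$Q_1 Q_2 (Q_1 Q_2)^T = Q_1 Q_2 Q_2^T Q_1^T = Q_1 I Q_1^T = I$$

and similarly

$$(Q_1 Q_2)^T Q_1 Q_2 = I.$$

Thus the product of orthogonal matrices is orthogonal. Also the transpose of an orthogonal
matrix is orthogonal directly from the definition. Therefore, from 5.4

$$A = (Q_p Q_{p-1} \cdots Q_1)^T R \equiv QR.$$

This proves the following theorem.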
This proves the following theorem. 
Theorem 5.7.5 Let $A$ be any real $m \times n$ matrix. Then there exists an orthogonal matrix
$Q$ and an upper triangular matrix $R$ having nonnegative entries on the main diagonal such
that

$$A = QR$$

and this factorization can be accomplished in a systematic manner.
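
The systematic manner referred to in the theorem is the repeated use of Householder matrices. A minimal floating point sketch in Python follows. It is offered only as an illustration: the name `householder_qr` is hypothetical, no attention is paid to numerical stability (in particular, forming $\mathbf{a} - \mathbf{b}$ directly can lose accuracy when $\mathbf{a}$ is nearly equal to $\mathbf{b}$), and the step is skipped when $\mathbf{a} = \mathbf{b}$ or $\mathbf{a} = \mathbf{0}$ since the Householder formula is undefined there.

```python
import math


def householder_qr(A):
    """Return (Q, R) with A = Q R, Q orthogonal, R upper triangular."""
    m, n = len(A), len(A[0])
    R = [[float(v) for v in row] for row in A]
    Q = [[float(i == j) for j in range(m)] for i in range(m)]
    for k in range(min(m, n)):
        a = [R[i][k] for i in range(k, m)]       # column k from the diagonal down
        d = a[:]
        d[0] -= math.sqrt(sum(v * v for v in a))  # d = a - b, b = (|a|, 0, ..., 0)
        norm2 = sum(v * v for v in d)
        if norm2 == 0.0:
            continue  # a = 0 or a = b already: use the identity for this step
        size = len(d)
        # R <- H R on rows k.., where H = I - 2 d d^T / |d|^2
        for j in range(n):
            s = sum(d[i] * R[k + i][j] for i in range(size))
            for i in range(size):
                R[k + i][j] -= 2 * d[i] * s / norm2
        # Q <- Q H on columns k..; each H is symmetric, so Q = H_1 H_2 ... H_p
        for r in range(m):
            s = sum(Q[r][k + i] * d[i] for i in range(size))
            for i in range(size):
                Q[r][k + i] -= 2 * s * d[i] / norm2
    return Q, R


A = [[3, 1], [4, 2]]
Q, R = householder_qr(A)
print(Q)
print(R)
```

Because the arithmetic is floating point, the entries of `R` below the diagonal come out as zero only up to rounding error, so any check of the result should use a tolerance.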



5.8 Exercises 

1. Find an LU factorization of



2. Find a LU factorization of 1 3 2 1 



3. Find a PLU factorization of 1 2 2 



4. Find a PLU factorization of 2 4 2 4 1 



5. Find a PLU factorization of 



1 2 



2 4 1 

3 2 1 



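6. Is there only one LU factorization for a given matrix? Hint: Consider the equation

$$\begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

7. Here is a matrix and an LU factorization of it.

$$A = \begin{pmatrix} 1 & 2 & 5 & 0 \\ 1 & 1 & 4 & 9 \\ 0 & 1 & 2 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 5 & 0 \\ 0 & -1 & -1 & 9 \\ 0 & 0 & 1 & 14 \end{pmatrix}$$

Use this factorization to solve the system of equations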



8. Find a QR factorization for the matrix 
9. Find a QR factorization for the matrix



12 10 
3 11 
10 2 1 



10. If you had a QR factorization, A = QR, describe how you could use it to solve the 
equation Ax = b. 



11. If $Q$ is an orthogonal matrix, show the columns are an orthonormal set. That is, show
that for

$$Q = (\mathbf{q}_1 \cdots \mathbf{q}_n)$$

it follows that $\mathbf{q}_i \cdot \mathbf{q}_j = \delta_{ij}$. Also show that any orthonormal set of vectors is linearly
independent.

12. Show you can't expect uniqueness for QR factorizations. Consider

0' 
1 
1 



|V2 ±72 
|V2 -W2 



0^ 





1 
1 
1 




1 
1 



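Using Definition 5.7.4, can it be concluded that if $A$ is an invertible matrix it will
follow there is only one QR factorization?

13. Suppose $\{\mathbf{a}_1, \cdots, \mathbf{a}_n\}$ are linearly independent vectors in $\mathbb{R}^n$ and let

$$A = (\mathbf{a}_1 \cdots \mathbf{a}_n)$$

Form a QR factorization for $A$.

$$(\mathbf{a}_1 \cdots \mathbf{a}_n) = (\mathbf{q}_1 \cdots \mathbf{q}_n)\begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ 0 & r_{22} & \cdots & r_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & r_{nn} \end{pmatrix}$$

Show that for each $k \le n$,

$$\operatorname{span}(\mathbf{a}_1, \cdots, \mathbf{a}_k) = \operatorname{span}(\mathbf{q}_1, \cdots, \mathbf{q}_k)$$

Prove that every subspace of $\mathbb{R}^n$ has an orthonormal basis. The procedure just
described is similar to the Gram Schmidt procedure which will be presented later.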

14. Suppose Q n R n converges to an orthogonal matrix Q where Q n is orthogonal and R n 
is upper triangular having all positive entries on the diagonal. Show that then Q n 
converges to Q and R n converges to the identity. 
Linear Programming



6.1 Simple Geometric Considerations 

One of the most important uses of row operations is in solving linear program problems 
which involve maximizing a linear function subject to inequality constraints determined 
from linear equations. Here is an example. A certain hamburger store has 9000 hamburger 
patties to use in one week and a limitless supply of special sauce, lettuce, tomatoes, onions, 
and buns. They sell two types of hamburgers, the big stack and the basic burger. It has also 
been determined that the employees cannot prepare more than 9000 of either type in one 
week. The big stack, popular with the teenagers from the local high school, involves two 
patties, lots of delicious sauce, condiments galore, and a divider between the two patties. 
The basic burger, very popular with children, involves only one patty and some pickles 
and ketchup. Demand for the basic burger is twice what it is for the big stack. What 
is the maximum number of hamburgers which could be sold in one week given the above 
limitations? 



Let $x$ be the number of basic burgers and $y$ the number of big stacks which could be sold
in a week. Thus it is desired to maximize $z = x + y$ subject to the above constraints. The
total number of patties is 9000 and so the number of patties used is $x + 2y$. This number must
satisfy $x + 2y \le 9000$ because there are only 9000 patties available. Because of the limitation
on the number the employees can prepare and the demand, it follows $2x + y \le 9000$.
You never sell a negative number of hamburgers and so $x, y \ge 0$. In simpler terms the
problem reduces to maximizing $z = x + y$ subject to the two constraints, $x + 2y \le 9000$ and
$2x + y \le 9000$. This problem is pretty easy to solve geometrically. Consider the following
picture in which $R$ labels the region described by the above inequalities and the line $z = x + y$
is shown for a particular value of $z$.




As you make $z$ larger this line moves away from the origin, always having the same slope
and the desired solution would consist of a point in the region, $R$ which makes $z$ as large as
possible or equivalently one for which the line is as far as possible from the origin. Clearly
this point is the point of intersection of the two lines, $(3000, 3000)$ and so the maximum
value of the given function is 6000.

Of course this type of procedure is fine for a situation in
which there are only two variables but what about a similar problem in which there are very
many variables? In reality, this hamburger store makes many more types of burgers than
those two and there are many considerations other than demand and available patties. Each
will likely give you a constraint which must be considered in order to solve a more realistic
problem and the end result will likely be a problem in many dimensions, probably many
more than three so your ability to draw a picture will get you nowhere for such a problem.
Another method is needed. This method is the topic of this section. I will illustrate with
this particular problem. Let $x_1 = x$ and $y = x_2$. Also let $x_3$ and $x_4$ be nonnegative variables
such that

$$x_1 + 2x_2 + x_3 = 9000, \quad 2x_1 + x_2 + x_4 = 9000.$$

To say that $x_3$ and $x_4$ are nonnegative is the same as saying $x_1 + 2x_2 \le 9000$ and $2x_1 + x_2 \le
9000$ and these variables are called slack variables at this point. They are called this because
they "take up the slack". I will discuss these more later. First a general situation is
considered.
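
For a problem this small the geometric answer can be double checked by brute force. The Python sketch below relies on a standard fact which the picture suggests but which is not proved in this section: a linear function on a bounded region of this kind attains its maximum at a corner. It intersects every pair of boundary lines, keeps the intersection points which satisfy all the constraints, and picks the best one. The names `constraints` and `corners` are invented for the illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each triple (a, b, c) stands for a*x + b*y <= c.
# The last two entries encode x >= 0 and y >= 0.
constraints = [(1, 2, 9000), (2, 1, 9000), (-1, 0, 0), (0, -1, 0)]


def corners(constraints):
    """Feasible intersection points of pairs of boundary lines."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never meet
        x = Fraction(c1 * b2 - c2 * b1, det)  # Cramer's rule
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            points.add((x, y))
    return points


best = max(corners(constraints), key=lambda p: p[0] + p[1])
print(sorted(corners(constraints)))
print(best, best[0] + best[1])
```

This kind of enumeration becomes hopeless as the number of variables and constraints grows, which is one way to see why a more organized method is needed.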
6.2 The Simplex Tableau

Here is some notation.

Definition 6.2.1 Let $\mathbf{x}, \mathbf{y}$ be vectors in $\mathbb{R}^q$. Then $\mathbf{x} \le \mathbf{y}$ means for each $i$, $x_i \le y_i$.

The problem is as follows:

Let $A$ be an $m \times (m + n)$ real matrix of rank $m$. It is desired to find $\mathbf{x} \in \mathbb{R}^{n+m}$ such
that $\mathbf{x}$ satisfies the constraints,

$$\mathbf{x} \ge \mathbf{0}, \quad A\mathbf{x} = \mathbf{b} \tag{6.1}$$

and out of all such $\mathbf{x}$,

$$z \equiv \sum_{i=1}^{m+n} c_i x_i$$

is as large (or small) as possible. This is usually referred to as maximizing or minimizing $z$
subject to the above constraints. First I will consider the constraints.

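Let $A = (\mathbf{a}_1 \cdots \mathbf{a}_{n+m})$. First you find a vector $\mathbf{x}^0 \ge \mathbf{0}$, $A\mathbf{x}^0 = \mathbf{b}$ such that $n$ of
the components of this vector equal 0. Letting $i_1, \cdots, i_n$ be the positions of $\mathbf{x}^0$ for which
$x_i^0 = 0$, suppose also that $\{\mathbf{a}_{j_1}, \cdots, \mathbf{a}_{j_m}\}$ is linearly independent for $j_i$ the other positions
of $\mathbf{x}^0$. Geometrically, this means that $\mathbf{x}^0$ is a corner of the feasible region, those $\mathbf{x}$ which
satisfy the constraints. This is called a basic feasible solution. Also define

$$\mathbf{c}_B \equiv (c_{j_1}, \cdots, c_{j_m}), \quad \mathbf{c}_F \equiv (c_{i_1}, \cdots, c_{i_n}),$$
$$\mathbf{x}_B \equiv (x_{j_1}, \cdots, x_{j_m}), \quad \mathbf{x}_F \equiv (x_{i_1}, \cdots, x_{i_n}),$$

and

$$z^0 \equiv z(\mathbf{x}^0) = \begin{pmatrix} \mathbf{c}_B & \mathbf{c}_F \end{pmatrix}\begin{pmatrix} \mathbf{x}_B^0 \\ \mathbf{x}_F^0 \end{pmatrix} = \mathbf{c}_B \mathbf{x}_B^0$$

since $\mathbf{x}_F^0 = \mathbf{0}$. The variables which are the components of the vector $\mathbf{x}_B$ are called the basic
variables and the variables which are the entries of $\mathbf{x}_F$ are called the free variables. You
set $\mathbf{x}_F = \mathbf{0}$. Now $(\mathbf{x}^0, z^0)^T$ is a solution to

$$\begin{pmatrix} A & \mathbf{0} \\ -\mathbf{c} & 1 \end{pmatrix}\begin{pmatrix} \mathbf{x} \\ z \end{pmatrix} = \begin{pmatrix} \mathbf{b} \\ 0 \end{pmatrix}$$

along with the constraints $\mathbf{x} \ge \mathbf{0}$. Writing the above in augmented matrix form yields

$$\begin{pmatrix} A & \mathbf{0} & \mathbf{b} \\ -\mathbf{c} & 1 & 0 \end{pmatrix} \tag{6.2}$$

Permute the columns and variables on the left if necessary to write the above in the form

$$\begin{pmatrix} B & F & \mathbf{0} \\ -\mathbf{c}_B & -\mathbf{c}_F & 1 \end{pmatrix}\begin{pmatrix} \mathbf{x}_B \\ \mathbf{x}_F \\ z \end{pmatrix} = \begin{pmatrix} \mathbf{b} \\ 0 \end{pmatrix} \tag{6.3}$$

or equivalently in the augmented matrix form keeping track of the variables on the bottom as

$$\begin{pmatrix} B & F & \mathbf{0} & \mathbf{b} \\ -\mathbf{c}_B & -\mathbf{c}_F & 1 & 0 \\ \mathbf{x}_B & \mathbf{x}_F & 0 & 0 \end{pmatrix}. \tag{6.4}$$

Here $B$ pertains to the variables $x_{j_1}, \cdots, x_{j_m}$ and is an $m \times m$ matrix with linearly
independent columns, $\{\mathbf{a}_{j_1}, \cdots, \mathbf{a}_{j_m}\}$, and $F$ is an $m \times n$ matrix. Now it is assumed that

$$\begin{pmatrix} B & F \end{pmatrix}\begin{pmatrix} \mathbf{x}_B^0 \\ \mathbf{x}_F^0 \end{pmatrix} = \begin{pmatrix} B & F \end{pmatrix}\begin{pmatrix} \mathbf{x}_B^0 \\ \mathbf{0} \end{pmatrix} = B\mathbf{x}_B^0 = \mathbf{b}$$

and since $B$ is assumed to have rank $m$, it follows

$$\mathbf{x}_B^0 = B^{-1}\mathbf{b} \ge \mathbf{0}. \tag{6.5}$$

This is very important to observe. $B^{-1}\mathbf{b} \ge \mathbf{0}$! This is by the assumption that $\mathbf{x}^0 \ge \mathbf{0}$.
Do row operations on the top part of the matrix

$$\begin{pmatrix} B & F & \mathbf{0} & \mathbf{b} \\ -\mathbf{c}_B & -\mathbf{c}_F & 1 & 0 \end{pmatrix} \tag{6.6}$$

and obtain its row reduced echelon form. Then after these row operations the above becomes

$$\begin{pmatrix} I & B^{-1}F & \mathbf{0} & B^{-1}\mathbf{b} \\ -\mathbf{c}_B & -\mathbf{c}_F & 1 & 0 \end{pmatrix} \tag{6.7}$$

where $B^{-1}\mathbf{b} \ge \mathbf{0}$. Next do another row operation in order to get a $\mathbf{0}$ where you see a $-\mathbf{c}_B$.
Thus

$$\begin{pmatrix} I & B^{-1}F & \mathbf{0} & B^{-1}\mathbf{b} \\ \mathbf{0} & \mathbf{c}_B B^{-1}F - \mathbf{c}_F & 1 & \mathbf{c}_B B^{-1}\mathbf{b} \end{pmatrix} \tag{6.8}$$

$$= \begin{pmatrix} I & B^{-1}F & \mathbf{0} & B^{-1}\mathbf{b} \\ \mathbf{0} & \mathbf{c}_B B^{-1}F - \mathbf{c}_F & 1 & \mathbf{c}_B \mathbf{x}_B^0 \end{pmatrix} = \begin{pmatrix} I & B^{-1}F & \mathbf{0} & B^{-1}\mathbf{b} \\ \mathbf{0} & \mathbf{c}_B B^{-1}F - \mathbf{c}_F & 1 & z^0 \end{pmatrix} \tag{6.9}$$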



The reason there is a $z^0$ on the bottom right corner is that $\mathbf{x}_F = \mathbf{0}$ and $(\mathbf{x}_B^0, \mathbf{x}_F^0, z^0)^T$ is a
solution of the system of equations represented by the above augmented matrix because it is
a solution to the system of equations corresponding to the system of equations represented
by 6.6 and row operations leave solution sets unchanged. Note how attractive this is. The $z^0$
is the value of $z$ at the point $\mathbf{x}^0$. The augmented matrix of 6.9 is called the simplex tableau
and it is the beginning point for the simplex algorithm to be described a little later. It is
very convenient to express the simplex tableau in the above form in which the variables are
possibly permuted in order to have $\begin{pmatrix} I \\ \mathbf{0} \end{pmatrix}$ on the left side. However, as far as the simplex
algorithm is concerned it is not necessary to be permuting the variables in this manner.
Starting with 6.9 you could permute the variables and columns to obtain an augmented
matrix in which the variables are in their original order. What is really required for the
simplex tableau?

It is an augmented $(m + 1) \times (m + n + 2)$ matrix which represents a system of equations
which has the same set of solutions, $(\mathbf{x}, z)^T$ as the system whose augmented matrix is

$$\begin{pmatrix} A & \mathbf{0} & \mathbf{b} \\ -\mathbf{c} & 1 & 0 \end{pmatrix}$$

(Possibly the variables for x are taken in another order.) There are m linearly independent 
columns in the first m + n columns for which there is only one nonzero entry, a 1 in one of 
the first m rows, the "simple columns" , the other first m + n columns being the "nonsimple 
columns". As in the above, the variables corresponding to the simple columns are xb, 
the basic variables and those corresponding to the nonsimple columns are x F , the free 
variables. Also, the top m entries of the last column on the right are nonnegative. This is 
the description of a simplex tableau. 

In a simplex tableau it is easy to spot a basic feasible solution. You can see one quickly
by setting the variables, $\mathbf{x}_F$ corresponding to the nonsimple columns equal to zero. Then
the other variables, corresponding to the simple columns are each equal to a nonnegative
entry in the far right column. Let's call this an "obvious basic feasible solution". If a
solution is obtained by setting the variables corresponding to the nonsimple columns equal
to zero and reading the variables corresponding to the simple columns off the far right
column, whether or not these entries are nonnegative, this will be
referred to as an "obvious" solution. Let's also call the first $m + n$ entries in the bottom
row the "bottom left row". In a simplex tableau, the entry in the bottom right corner gives
the value of the variable being maximized or minimized when the obvious basic feasible
solution is chosen.

The following is a special case of the general theory presented above and shows how such
a special case can be fit into the above framework. The following example is rather typical
of the sorts of problems considered. It involves inequality constraints instead of $A\mathbf{x} = \mathbf{b}$.
This is handled by adding in "slack variables" as explained below.

The idea is to obtain an augmented matrix for the constraints such that obvious solutions 
are also feasible. Then there is an algorithm, to be presented later, which takes you from 
one obvious feasible solution to another until you obtain the maximum. 

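Example 6.2.2 Consider $z = x_1 - x_2$ subject to the constraints, $x_1 + 2x_2 \le 10$, $x_1 + 2x_2 \ge 2$,
and $2x_1 + x_2 \le 6$, $x_i \ge 0$. Find a simplex tableau for a problem of the form $\mathbf{x} \ge \mathbf{0}, A\mathbf{x} = \mathbf{b}$
which is equivalent to the above problem.

You add in slack variables. These are nonnegative variables, one for each of the first three
constraints, which change the first three inequalities into equations. Thus the first three
inequalities become $x_1 + 2x_2 + x_3 = 10$, $x_1 + 2x_2 - x_4 = 2$, and $2x_1 + x_2 + x_5 = 6$, $x_1, x_2, x_3, x_4, x_5 \ge 0$.
Now it is necessary to find a basic feasible solution. You mainly need to find a nonnegative
solution to the equations,

$$\begin{aligned} x_1 + 2x_2 + x_3 &= 10 \\ x_1 + 2x_2 - x_4 &= 2 \\ 2x_1 + x_2 + x_5 &= 6 \end{aligned}$$

the solution set for the above system is given by

$$x_2 = \frac{2}{3}x_4 - \frac{2}{3} + \frac{1}{3}x_5, \quad x_1 = -\frac{1}{3}x_4 + \frac{10}{3} - \frac{2}{3}x_5, \quad x_3 = -x_4 + 8.$$

An easy way to get a basic feasible solution is to let $x_4 = 8$ and $x_5 = 1$. Then a feasible
solution is

$$(x_1, x_2, x_3, x_4, x_5) = (0, 5, 0, 8, 1).$$

It follows $z^0 = -5$ and the matrix 6.2, $\begin{pmatrix} A & \mathbf{0} & \mathbf{b} \\ -\mathbf{c} & 1 & 0 \end{pmatrix}$, with the variables kept track of on
the bottom is

$$\begin{pmatrix} 1 & 2 & 1 & 0 & 0 & 0 & 10 \\ 1 & 2 & 0 & -1 & 0 & 0 & 2 \\ 2 & 1 & 0 & 0 & 1 & 0 & 6 \\ -1 & 1 & 0 & 0 & 0 & 1 & 0 \\ x_1 & x_2 & x_3 & x_4 & x_5 & 0 & 0 \end{pmatrix}$$

and the first thing to do is to permute the columns so that the list of variables on the bottom
will have $x_1$ and $x_3$ at the end.

$$\begin{pmatrix} 2 & 0 & 0 & 1 & 1 & 0 & 10 \\ 2 & -1 & 0 & 1 & 0 & 0 & 2 \\ 1 & 0 & 1 & 2 & 0 & 0 & 6 \\ 1 & 0 & 0 & -1 & 0 & 1 & 0 \\ x_2 & x_4 & x_5 & x_1 & x_3 & 0 & 0 \end{pmatrix}$$

Next, as described above, take the row reduced echelon form of the top three lines of the
above matrix. This yields

$$\begin{pmatrix} 1 & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 & 5 \\ 0 & 1 & 0 & 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & \frac{3}{2} & -\frac{1}{2} & 0 & 1 \end{pmatrix}.$$

Now do row operations to

$$\begin{pmatrix} 1 & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 & 5 \\ 0 & 1 & 0 & 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & \frac{3}{2} & -\frac{1}{2} & 0 & 1 \\ 1 & 0 & 0 & -1 & 0 & 1 & 0 \end{pmatrix}$$

to finally obtain

$$\begin{pmatrix} 1 & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 & 5 \\ 0 & 1 & 0 & 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & \frac{3}{2} & -\frac{1}{2} & 0 & 1 \\ 0 & 0 & 0 & -\frac{3}{2} & -\frac{1}{2} & 1 & -5 \end{pmatrix}$$

and this is a simplex tableau. The variables are $x_2, x_4, x_5, x_1, x_3, z$.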

It isn't as hard as it may appear from the above. Let's not permute the variables and 
simply find an acceptable simplex tableau as described above. 



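Example 6.2.3 Consider $z = x_1 - x_2$ subject to the constraints $x_1 + 2x_2 \le 10$, $x_1 + 2x_2 \ge 2$, and $2x_1 + x_2 \le 6$, $x_i \ge 0$. Find a simplex tableau.

Adding in slack variables, an augmented matrix which is descriptive of the constraints is

\[
\begin{pmatrix}
1 & 2 & 1 & 0 & 0 & 10 \\
1 & 2 & 0 & -1 & 0 & 2 \\
2 & 1 & 0 & 0 & 1 & 6
\end{pmatrix}
\]

The obvious solution is not feasible because of that $-1$ in the fourth column. When you let $x_1, x_2 = 0$, you end up having $x_4 = -2$ which is negative. Consider the second column and select the 2 in the second row as a pivot to zero out that which is above and below it.

\[
\begin{pmatrix}
0 & 0 & 1 & 1 & 0 & 8 \\
\frac{1}{2} & 1 & 0 & -\frac{1}{2} & 0 & 1 \\
\frac{3}{2} & 0 & 0 & \frac{1}{2} & 1 & 5
\end{pmatrix}
\]

This one is good. When you let $x_1 = x_4 = 0$, you find that $x_2 = 1, x_3 = 8, x_5 = 5$. The obvious solution is now feasible. You can now assemble the simplex tableau. The first step is to include a column and row for $z$. This yields

\[
\begin{pmatrix}
0 & 0 & 1 & 1 & 0 & 0 & 8 \\
\frac{1}{2} & 1 & 0 & -\frac{1}{2} & 0 & 0 & 1 \\
\frac{3}{2} & 0 & 0 & \frac{1}{2} & 1 & 0 & 5 \\
-1 & 1 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}
\]

Now you need to get zeros in the right places so the simple columns will be preserved as simple columns in this larger matrix. This means you need to zero out the 1 in the second column on the bottom. A simplex tableau is now

\[
\begin{pmatrix}
0 & 0 & 1 & 1 & 0 & 0 & 8 \\
\frac{1}{2} & 1 & 0 & -\frac{1}{2} & 0 & 0 & 1 \\
\frac{3}{2} & 0 & 0 & \frac{1}{2} & 1 & 0 & 5 \\
-\frac{3}{2} & 0 & 0 & \frac{1}{2} & 0 & 1 & -1
\end{pmatrix}
\]
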
Note it is not the same one obtained earlier. There is no reason a simplex tableau should 
be unique. In fact, it follows from the above general description that you have one for each 
basic feasible point of the region determined by the constraints. 
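To make the last remark concrete, here is a small sketch that lists every basic feasible solution of the system from Example 6.2.2 by brute force: choose three of the five columns, solve for those variables with the others set to zero, and keep the result if it is nonnegative. The helper `solve` is an assumption of this sketch (plain Gauss-Jordan elimination with exact fractions), not a routine from the text.

```python
from fractions import Fraction
from itertools import combinations

def solve(M, rhs):
    """Solve the square system M x = rhs exactly; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if piv is None:
            return None
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * q for a, q in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Constraint equations of Example 6.2.2 in the variables x1, ..., x5
A = [[1, 2, 1, 0, 0],
     [1, 2, 0, -1, 0],
     [2, 1, 0, 0, 1]]
b = [10, 2, 6]

basic_feasible = []
for cols in combinations(range(5), 3):
    B = [[A[i][j] for j in cols] for i in range(3)]
    xB = solve(B, b)
    if xB is not None and all(v >= 0 for v in xB):
        x = [Fraction(0)] * 5
        for j, v in zip(cols, xB):
            x[j] = v
        basic_feasible.append(tuple(x))

for x in basic_feasible:
    print([str(v) for v in x])
```

Each point printed corresponds to a corner of the feasible region in the $(x_1, x_2)$ plane, and each one has its own simplex tableau.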

6.3 The Simplex Algorithm 

6.3.1 Maximums 

The simplex algorithm takes you from one basic feasible solution to another while maxi- 
mizing or minimizing the function you are trying to maximize or minimize. Algebraically, 
it takes you from one simplex tableau to another in which the lower right corner either 
increases in the case of maximization or decreases in the case of minimization. 

I will continue writing the simplex tableau in such a way that the simple columns having only one entry nonzero are on the left. As explained above, this amounts to permuting the variables. I will do this because it is possible to describe what is going on without onerous notation. However, in the examples, I won't worry so much about it. Thus, from a basic feasible solution, a simplex tableau of the following form has been obtained in which the columns for the basic variables, $x_B$, are listed first and $b \ge 0$.

\[
\begin{pmatrix}
I & F & 0 & b \\
0 & c & 1 & z^0
\end{pmatrix}
\]

Let $x_i^0 = b_i$ for $i = 1, \dots, m$ and $x_i^0 = 0$ for $i > m$. Then $(x^0, z^0)$ is a solution to the above system and since $b \ge 0$, it follows $(x^0, z^0)$ is a basic feasible solution.

If $c_i < 0$ for some $i$, and if $F_{ji} \le 0$ for every $j$, so that a whole column of $\begin{pmatrix} F \\ c \end{pmatrix}$ is $\le 0$ with the bottom entry $< 0$, then letting $x_i$ be the variable corresponding to that column, you could leave all the other entries of $x_F$ equal to zero but change $x_i$ to be positive. Let the new vector be denoted by $x_F'$, and letting $x_B' = b - F x_F'$ it follows

\[
(x_B')_k = b_k - \sum_j F_{kj} (x_F')_j = b_k - F_{ki} x_i \ge 0.
\]

Now this shows $(x_B', x_F')$ is feasible whenever $x_i > 0$ and so you could let $x_i$ become arbitrarily large and positive and conclude there is no maximum for $z$ because

\[
z = (-c_i) x_i + z^0 \tag{6.11}
\]

If this happens in a simplex tableau, you can say there is no maximum and stop.

What if $c \ge 0$? Then $z = z^0 - c x_F$ and to satisfy the constraints, you need $x_F \ge 0$. Therefore, in this case, $z^0$ is the largest possible value of $z$ and so the maximum has been found. You stop when this occurs. Next I explain what to do if neither of the above stopping conditions hold.

The only case which remains is that some $c_i < 0$ and some $F_{ji} > 0$. You pick a column in $\begin{pmatrix} F \\ c \end{pmatrix}$ in which $c_i < 0$, usually the one for which $c_i$ is the largest in absolute value. You pick $F_{ji} > 0$ as a pivot element, divide the $j$th row by $F_{ji}$ and then use it to obtain zeros above $F_{ji}$ and below $F_{ji}$, thus obtaining a new simple column. This row operation also makes exactly one of the other simple columns into a nonsimple column. (In terms of variables, it is said that a free variable becomes a basic variable and a basic variable becomes a free variable.) Now permuting the columns and variables yields

\[
\begin{pmatrix}
I & F' & 0 & b' \\
0 & c' & 1 & z^{0\prime}
\end{pmatrix}
\]

where $z^{0\prime} \ge z^0$ because $z^{0\prime} = z^0 - c_i \left( \frac{b_j}{F_{ji}} \right)$ and $c_i < 0$. If $b' \ge 0$, you are in the same position you were at the beginning but now $z^0$ is larger. Now here is the important thing. You don't pick just any $F_{ji}$ when you do these row operations. You pick the positive one for which the row operation results in $b' \ge 0$. Otherwise the obvious basic feasible solution obtained by letting $x_F' = 0$ will fail to satisfy the constraint that $x \ge 0$.

How is this done? You need

\[
b_k' \equiv b_k - \frac{F_{ki} b_j}{F_{ji}} \ge 0 \tag{6.12}
\]

for each $k = 1, \dots, m$ or equivalently,

\[
b_k \ge \frac{F_{ki} b_j}{F_{ji}}. \tag{6.13}
\]

Now if $F_{ki} \le 0$ the above holds. Therefore, you only need to check $F_{pi}$ for $F_{pi} > 0$. The pivot, $F_{ji}$, is the one which makes the quotients of the form

\[ \frac{b_p}{F_{pi}} \]

for all positive $F_{pi}$ the smallest. This will work because for $F_{ki} > 0$,

\[
\frac{b_p}{F_{pi}} \le \frac{b_k}{F_{ki}} \Rightarrow b_k \ge \frac{F_{ki} b_p}{F_{pi}}.
\]

Having gotten a new simplex tableau, you do the same thing to it which was just done and continue. As long as $b > 0$, so you don't encounter the degenerate case, the values for $z$ associated with setting $x_F = 0$ keep getting strictly larger every time the process is repeated. You keep going until you find $c \ge 0$. Then you stop. You are at a maximum. Problems can occur in the process in the so-called degenerate case when at some stage of the process some $b_j = 0$. In this case you can cycle through different values for $x$ with no improvement in $z$. This case will not be discussed here. 
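The ratio test and the pivot operation just described can be sketched as follows. This is a minimal illustration, not a robust implementation: the function names `ratio_test` and `pivot` are assumptions of the sketch, and the tableau used is the one found in Example 6.2.2, whose columns are ordered $x_2, x_4, x_5, x_1, x_3, z$.

```python
from fractions import Fraction as F

def ratio_test(T, col):
    """Return the row minimizing b_p / F_pi over rows with F_pi > 0, or None."""
    best = None
    for r in range(len(T) - 1):          # skip the bottom (objective) row
        if T[r][col] > 0:
            ratio = T[r][-1] / T[r][col]
            if best is None or ratio < best[0]:
                best = (ratio, r)
    return None if best is None else best[1]

def pivot(T, row, col):
    """Divide the pivot row by the pivot, then zero out the rest of the column."""
    p = T[row][col]
    T[row] = [v / p for v in T[row]]
    for r in range(len(T)):
        if r != row and T[r][col] != 0:
            f = T[r][col]
            T[r] = [a - f * q for a, q in zip(T[r], T[row])]

# Simplex tableau of Example 6.2.2
T = [[1, 0, 0, F(1, 2), F(1, 2), 0, 5],
     [0, 1, 0, 0, 1, 0, 8],
     [0, 0, 1, F(3, 2), F(-1, 2), 0, 1],
     [0, 0, 0, F(-3, 2), F(-1, 2), 1, -5]]
T = [[F(v) for v in row] for row in T]

col = 3                       # column of the most negative bottom entry, -3/2
row = ratio_test(T, col)      # compares 5/(1/2) with 1/(3/2)
pivot(T, row, col)
print(row, [str(v) for v in T[-1]])
```

Because the pivot row was chosen by the smallest quotient, the right column stays nonnegative after the row operations, which is exactly condition 6.12.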



Example 6.3.1 Maximize $2x_1 + 3x_2$ subject to the constraints $x_1 + x_2 \ge 1$, $2x_1 + x_2 \le 6$, $x_1 + 2x_2 \le 6$, $x_1, x_2 \ge 0$.

The constraints are of the form

\[
\begin{aligned}
x_1 + x_2 - x_3 &= 1 \\
2x_1 + x_2 + x_4 &= 6 \\
x_1 + 2x_2 + x_5 &= 6
\end{aligned}
\]

where the $x_3, x_4, x_5$ are the slack variables. An augmented matrix for these equations is of the form

\[
\begin{pmatrix}
1 & 1 & -1 & 0 & 0 & 1 \\
2 & 1 & 0 & 1 & 0 & 6 \\
1 & 2 & 0 & 0 & 1 & 6
\end{pmatrix}
\]

Obviously the obvious solution is not feasible. It results in $x_3 < 0$. We need to exchange basic variables. Let's just try something.

\[
\begin{pmatrix}
1 & 1 & -1 & 0 & 0 & 1 \\
0 & -1 & 2 & 1 & 0 & 4 \\
0 & 1 & 1 & 0 & 1 & 5
\end{pmatrix}
\]

Now this one is all right: letting $x_2 = x_3 = 0$ gives $x_1 = 1$, $x_4 = 4$, $x_5 = 5$, so the obvious solution is feasible. Now we add in the objective function as described above.

\[
\begin{pmatrix}
1 & 1 & -1 & 0 & 0 & 0 & 1 \\
0 & -1 & 2 & 1 & 0 & 0 & 4 \\
0 & 1 & 1 & 0 & 1 & 0 & 5 \\
-2 & -3 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}
\]

Then do row operations to leave the simple columns the same. Then

\[
\begin{pmatrix}
1 & 1 & -1 & 0 & 0 & 0 & 1 \\
0 & -1 & 2 & 1 & 0 & 0 & 4 \\
0 & 1 & 1 & 0 & 1 & 0 & 5 \\
0 & -1 & -2 & 0 & 0 & 1 & 2
\end{pmatrix}
\]

Now there are negative numbers on the bottom row to the left of the 1. Let's pick the first. (It would be more sensible to pick the second.) The ratios to look at are 5/1, 1/1 so pick for the pivot the 1 in the second column and first row. This will leave the right column above the lower right corner nonnegative. Thus the next tableau is

\[
\begin{pmatrix}
1 & 1 & -1 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 & 5 \\
-1 & 0 & 2 & 0 & 1 & 0 & 4 \\
1 & 0 & -3 & 0 & 0 & 1 & 3
\end{pmatrix}
\]

There is still a negative number there to the left of the 1 in the bottom row. The new ratios are 4/2, 5/1 so the new pivot is the 2 in the third column. Thus the next tableau is

\[
\begin{pmatrix}
\frac{1}{2} & 1 & 0 & 0 & \frac{1}{2} & 0 & 3 \\
\frac{3}{2} & 0 & 0 & 1 & -\frac{1}{2} & 0 & 3 \\
-1 & 0 & 2 & 0 & 1 & 0 & 4 \\
-\frac{1}{2} & 0 & 0 & 0 & \frac{3}{2} & 1 & 9
\end{pmatrix}
\]

Still, there is a negative number in the bottom row to the left of the 1 so the process does not stop yet. The ratios are $3/(3/2)$ and $3/(1/2)$ and so the new pivot is that 3/2 in the first column. Thus the new tableau is

\[
\begin{pmatrix}
0 & 1 & 0 & -\frac{1}{3} & \frac{2}{3} & 0 & 2 \\
\frac{3}{2} & 0 & 0 & 1 & -\frac{1}{2} & 0 & 3 \\
0 & 0 & 2 & \frac{2}{3} & \frac{2}{3} & 0 & 6 \\
0 & 0 & 0 & \frac{1}{3} & \frac{4}{3} & 1 & 10
\end{pmatrix}
\]

Now stop. The maximum value is 10. This is an easy enough problem to do geometrically and so you can easily verify that this is the right answer. It occurs when $x_4 = x_5 = 0$, $x_1 = 2$, $x_2 = 2$, $x_3 = 3$. 

6.3.2 Minimums 

How does it differ if you are finding a minimum? From a basic feasible solution, a simplex tableau of the following form has been obtained in which the simple columns for the basic variables, $x_B$, are listed first and $b \ge 0$.

\[
\begin{pmatrix}
I & F & 0 & b \\
0 & c & 1 & z^0
\end{pmatrix}
\]

Let $x_i^0 = b_i$ for $i = 1, \dots, m$ and $x_i^0 = 0$ for $i > m$. Then $(x^0, z^0)$ is a solution to the above system and since $b \ge 0$, it follows $(x^0, z^0)$ is a basic feasible solution. So far, there is no change.

Suppose first that some $c_i > 0$ and $F_{ji} \le 0$ for each $j$. Then let $x_F'$ consist of changing $x_i$ by making it positive but leaving the other entries of $x_F$ equal to 0. Then from the bottom row,

\[
z = -c_i x_i + z^0
\]

and you let $x_B' = b - F x_F' \ge 0$. Thus the constraints continue to hold when $x_i$ is made increasingly positive and it follows from the above equation that there is no minimum for $z$. You stop when this happens.

Next suppose $c \le 0$. Then in this case, $z = z^0 - c x_F$ and from the constraints, $x_F \ge 0$ and so $-c x_F \ge 0$ and so $z^0$ is the minimum value and you stop since this is what you are looking for.

What do you do in the case where some $c_i > 0$ and some $F_{ji} > 0$? In this case, you use the simplex algorithm as in the case of maximums to obtain a new simplex tableau in which $z^{0\prime}$ is smaller. You choose $F_{ji}$ the same way to be the positive entry of the $i$th column such that $b_p / F_{pi} \ge b_j / F_{ji}$ for all positive entries $F_{pi}$ and do the same row operations. Now this time,

\[
z^{0\prime} = z^0 - c_i \left( \frac{b_j}{F_{ji}} \right) \le z^0,
\]

with strict inequality when $b_j > 0$.

As in the case of maximums no problem can occur and the process will converge unless you have the degenerate case in which some $b_j = 0$. As in the earlier case, this is most unfortunate when it occurs. You see what happens of course. $z^0$ does not change and the algorithm just delivers different values of the variables forever with no improvement. 

To summarize the geometrical significance of the simplex algorithm, it takes you from one 
corner of the feasible region to another. You go in one direction to find the maximum and 
in another to find the minimum. For the maximum you try to get rid of negative entries of c 
and for minimums you try to eliminate positive entries of c, where the method of elimination 
involves the judicious use of an appropriate pivot element and row operations. 
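The whole procedure, for either a maximum or a minimum, fits in one short loop. What follows is a sketch under stated assumptions rather than a production routine: the name `simplex` and its `maximize` flag are inventions of this sketch, the input must already be a simplex tableau with $b \ge 0$, and no safeguard against cycling in the degenerate case is included. It is tried on the tableau of Example 6.2.2.

```python
from fractions import Fraction as F

def simplex(T, maximize=True):
    """Pivot until the bottom row (left of the z column) has no entry of the
    wrong sign. Returns the lower right corner, or None if z is unbounded.
    No anti-cycling rule is included, so a degenerate tableau could loop."""
    T = [[F(v) for v in row] for row in T]
    m, n = len(T) - 1, len(T[0]) - 2      # constraint rows, variable columns
    while True:
        bottom = T[-1][:n]
        # most negative entry for a maximum, most positive for a minimum
        col = min(range(n), key=lambda j: bottom[j] if maximize else -bottom[j])
        done = bottom[col] >= 0 if maximize else bottom[col] <= 0
        if done:
            return T[-1][-1]
        rows = [r for r in range(m) if T[r][col] > 0]
        if not rows:
            return None                   # whole column <= 0: no max (or min)
        row = min(rows, key=lambda r: T[r][-1] / T[r][col])
        p = T[row][col]
        T[row] = [v / p for v in T[row]]
        for r in range(m + 1):
            if r != row and T[r][col] != 0:
                f = T[r][col]
                T[r] = [a - f * q for a, q in zip(T[r], T[row])]

# Simplex tableau of Example 6.2.2 (columns x2, x4, x5, x1, x3, z)
tableau = [[1, 0, 0, F(1, 2), F(1, 2), 0, 5],
           [0, 1, 0, 0, 1, 0, 8],
           [0, 0, 1, F(3, 2), F(-1, 2), 0, 1],
           [0, 0, 0, F(-3, 2), F(-1, 2), 1, -5]]

z_max = simplex(tableau, maximize=True)
z_min = simplex(tableau, maximize=False)
print(z_max, z_min)
```

The only difference between the two modes is the sign test on the bottom row; the choice of pivot row by the smallest quotient is the same in both.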

Now return to Example 6.2.2. It will be modified to be a maximization problem.

Example 6.3.2 Maximize $z = x_1 - x_2$ subject to the constraints $x_1 + 2x_2 \le 10$, $x_1 + 2x_2 \ge 2$, and $2x_1 + x_2 \le 6$, $x_i \ge 0$.

Recall this is the same as maximizing $z = x_1 - x_2$ subject to

\[
\begin{pmatrix}
1 & 2 & 1 & 0 & 0 \\
1 & 2 & 0 & -1 & 0 \\
2 & 1 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}
=
\begin{pmatrix} 10 \\ 2 \\ 6 \end{pmatrix}, \quad x \ge 0,
\]

the variables $x_3, x_4, x_5$ being slack variables. Recall the simplex tableau was

\[
\begin{pmatrix}
1 & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 & 5 \\
0 & 1 & 0 & 0 & 1 & 0 & 8 \\
0 & 0 & 1 & \frac{3}{2} & -\frac{1}{2} & 0 & 1 \\
0 & 0 & 0 & -\frac{3}{2} & -\frac{1}{2} & 1 & -5
\end{pmatrix}
\]

with the variables ordered as $x_2, x_4, x_5, x_1, x_3$ and so $x_B = (x_2, x_4, x_5)$ and $x_F = (x_1, x_3)$.

Apply the simplex algorithm to the fourth column because $-\frac{3}{2} < 0$ and this is the most negative entry in the bottom row. The pivot is 3/2 because $1/(3/2) = 2/3 < 5/(1/2)$. Dividing this row by 3/2 and then using this to zero out the other elements in that column, the new simplex tableau is

\[
\begin{pmatrix}
1 & 0 & -\frac{1}{3} & 0 & \frac{2}{3} & 0 & \frac{14}{3} \\
0 & 1 & 0 & 0 & 1 & 0 & 8 \\
0 & 0 & \frac{2}{3} & 1 & -\frac{1}{3} & 0 & \frac{2}{3} \\
0 & 0 & 1 & 0 & -1 & 1 & -4
\end{pmatrix}
\]

Now there is still a negative number in the bottom left row. Therefore, the process should be continued. This time the pivot is the 2/3 in the top of the column. Dividing the top row by 2/3 and then using this to zero out the entries below it,

\[
\begin{pmatrix}
\frac{3}{2} & 0 & -\frac{1}{2} & 0 & 1 & 0 & 7 \\
-\frac{3}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 & 1 \\
\frac{1}{2} & 0 & \frac{1}{2} & 1 & 0 & 0 & 3 \\
\frac{3}{2} & 0 & \frac{1}{2} & 0 & 0 & 1 & 3
\end{pmatrix}
\]

Now all the numbers on the bottom left row are nonnegative so the process stops. Now recall the variables and columns were ordered as $x_2, x_4, x_5, x_1, x_3$. The solution in terms of $x_1$ and $x_2$ is $x_2 = 0$ and $x_1 = 3$ and $z = 3$. Note that in the above, I did not worry about permuting the columns to keep those which go with the basic variables on the left. 
Here is a bucolic example. 

Example 6.3.3 Consider the following table.

              F1   F2   F3   F4
iron           1    2    1    3
protein        5    3    2    1
folic acid     1    2    2    1
copper         2    1    1    1
calcium        1    1    1    1

This information is available to a pig farmer and $F_i$ denotes a particular feed. The numbers in the table contain the number of units of a particular nutrient contained in one pound of the given feed. Thus $F_2$ has 2 units of iron in one pound. Now suppose the cost of each feed in cents per pound is given in the following table.

  F1   F2   F3   F4
   2    3    2    3


A typical pig needs 5 units of iron, 8 of protein, 6 of folic acid, 7 of copper and 4 of calcium. 
(The units may change from nutrient to nutrient.) How many pounds of each feed per pig 
should the pig farmer use in order to minimize his cost? 
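Before setting up the linear program, it can help to have the two tables in a form where a proposed feed mix can be checked directly. The sketch below is only a bookkeeping aid; the names `NUTRIENTS`, `REQUIRED`, `COST`, `is_feasible` and `cost` are assumptions of this illustration.

```python
from fractions import Fraction

# Rows: iron, protein, folic acid, copper, calcium; columns: feeds F1..F4
NUTRIENTS = [[1, 2, 1, 3],
             [5, 3, 2, 1],
             [1, 2, 2, 1],
             [2, 1, 1, 1],
             [1, 1, 1, 1]]
REQUIRED = [5, 8, 6, 7, 4]     # units needed per pig
COST = [2, 3, 2, 3]            # cents per pound of F1..F4

def is_feasible(x):
    """True if the mix x (pounds of each feed) meets every requirement."""
    if any(v < 0 for v in x):
        return False
    return all(sum(a * v for a, v in zip(row, x)) >= need
               for row, need in zip(NUTRIENTS, REQUIRED))

def cost(x):
    """Cost in cents of the mix x."""
    return sum(c * v for c, v in zip(COST, x))

print(is_feasible([7, 0, 0, 0]), cost([7, 0, 0, 0]))   # seven pounds of F1 only
print(is_feasible([1, 1, 1, 1]))                       # one pound of each feed
```

Seven pounds of the first feed alone satisfies every requirement but is wasteful; one pound of each feed is cheaper but falls short on copper. The linear program looks for the cheapest mix that passes the feasibility check.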



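His problem is to minimize $C \equiv 2x_1 + 3x_2 + 2x_3 + 3x_4$ subject to the constraints

\[
\begin{aligned}
x_1 + 2x_2 + x_3 + 3x_4 &\ge 5, \\
5x_1 + 3x_2 + 2x_3 + x_4 &\ge 8, \\
x_1 + 2x_2 + 2x_3 + x_4 &\ge 6, \\
2x_1 + x_2 + x_3 + x_4 &\ge 7, \\
x_1 + x_2 + x_3 + x_4 &\ge 4,
\end{aligned}
\]

where each $x_i \ge 0$. Add in the slack variables,

\[
\begin{aligned}
x_1 + 2x_2 + x_3 + 3x_4 - x_5 &= 5 \\
5x_1 + 3x_2 + 2x_3 + x_4 - x_6 &= 8 \\
x_1 + 2x_2 + 2x_3 + x_4 - x_7 &= 6 \\
2x_1 + x_2 + x_3 + x_4 - x_8 &= 7 \\
x_1 + x_2 + x_3 + x_4 - x_9 &= 4
\end{aligned}
\]

The augmented matrix for this system is

\[
\begin{pmatrix}
1 & 2 & 1 & 3 & -1 & 0 & 0 & 0 & 0 & 5 \\
5 & 3 & 2 & 1 & 0 & -1 & 0 & 0 & 0 & 8 \\
1 & 2 & 2 & 1 & 0 & 0 & -1 & 0 & 0 & 6 \\
2 & 1 & 1 & 1 & 0 & 0 & 0 & -1 & 0 & 7 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & -1 & 4
\end{pmatrix}
\]
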

How in the world can you find a basic feasible solution? Remember the simplex algorithm 
is designed to keep the entries in the right column nonnegative so you use this algorithm a 
few times till the obvious solution is a basic feasible solution. 

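Consider the first column. The pivot is the 5. Using the row operations described in the algorithm, you get

\[
\begin{pmatrix}
0 & \frac{7}{5} & \frac{3}{5} & \frac{14}{5} & -1 & \frac{1}{5} & 0 & 0 & 0 & \frac{17}{5} \\
5 & 3 & 2 & 1 & 0 & -1 & 0 & 0 & 0 & 8 \\
0 & \frac{7}{5} & \frac{8}{5} & \frac{4}{5} & 0 & \frac{1}{5} & -1 & 0 & 0 & \frac{22}{5} \\
0 & -\frac{1}{5} & \frac{1}{5} & \frac{3}{5} & 0 & \frac{2}{5} & 0 & -1 & 0 & \frac{19}{5} \\
0 & \frac{2}{5} & \frac{3}{5} & \frac{4}{5} & 0 & \frac{1}{5} & 0 & 0 & -1 & \frac{12}{5}
\end{pmatrix}
\]

Now go to the second column. The pivot in this column is the 7/5. This is in a different row than the pivot in the first column so I will use it to zero out everything below it. This will get rid of the zeros in the fifth column and introduce zeros in the second. This yields (after also dividing the second row by 5)

\[
\begin{pmatrix}
0 & 1 & \frac{3}{7} & 2 & -\frac{5}{7} & \frac{1}{7} & 0 & 0 & 0 & \frac{17}{7} \\
1 & 0 & \frac{1}{7} & -1 & \frac{3}{7} & -\frac{2}{7} & 0 & 0 & 0 & \frac{1}{7} \\
0 & 0 & 1 & -2 & 1 & 0 & -1 & 0 & 0 & 1 \\
0 & 0 & \frac{2}{7} & 1 & -\frac{1}{7} & \frac{3}{7} & 0 & -1 & 0 & \frac{30}{7} \\
0 & 0 & \frac{3}{7} & 0 & \frac{2}{7} & \frac{1}{7} & 0 & 0 & -1 & \frac{10}{7}
\end{pmatrix}
\]

Now consider another column, this time the fourth. I will pick this one because it has some negative numbers in it so there are fewer entries to check in looking for a pivot. Unfortunately, the pivot is the top 2 and I don't want to pivot on this because it would destroy the zeros in the second column. Consider the fifth column. It is also not a good choice because the pivot is the second element from the top and this would destroy the zeros in the first column. Consider the sixth column. I can use either of the two bottom entries as the pivot. Using the bottom one, the matrix is

\[
\begin{pmatrix}
0 & 1 & 0 & 2 & -1 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & -1 & 1 & 0 & 0 & 0 & -2 & 3 \\
0 & 0 & 1 & -2 & 1 & 0 & -1 & 0 & 0 & 1 \\
0 & 0 & -1 & 1 & -1 & 0 & 0 & -1 & 3 & 0 \\
0 & 0 & 3 & 0 & 2 & 1 & 0 & 0 & -7 & 10
\end{pmatrix}
\]

Next consider the third column. The pivot is the 1 in the third row. This yields

\[
\begin{pmatrix}
0 & 1 & 0 & 2 & -1 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & -2 & 2 \\
0 & 0 & 1 & -2 & 1 & 0 & -1 & 0 & 0 & 1 \\
0 & 0 & 0 & -1 & 0 & 0 & -1 & -1 & 3 & 1 \\
0 & 0 & 0 & 6 & -1 & 1 & 3 & 0 & -7 & 7
\end{pmatrix}
\]

There are still 5 columns which consist entirely of zeros except for one entry. Four of them have that entry equal to 1 but one still has a $-1$ in it, the $-1$ being in the fourth row. I need to do the row operations on a nonsimple column which has the pivot in the fourth row. Such a column is the second to the last. The pivot is the 3. The new matrix is

\[
\begin{pmatrix}
0 & 1 & 0 & \frac{7}{3} & -1 & 0 & \frac{1}{3} & \frac{1}{3} & 0 & \frac{2}{3} \\
1 & 0 & 0 & \frac{1}{3} & 0 & 0 & \frac{1}{3} & -\frac{2}{3} & 0 & \frac{8}{3} \\
0 & 0 & 1 & -2 & 1 & 0 & -1 & 0 & 0 & 1 \\
0 & 0 & 0 & -\frac{1}{3} & 0 & 0 & -\frac{1}{3} & -\frac{1}{3} & 1 & \frac{1}{3} \\
0 & 0 & 0 & \frac{11}{3} & -1 & 1 & \frac{2}{3} & -\frac{7}{3} & 0 & \frac{28}{3}
\end{pmatrix}
\tag{6.15}
\]

Now the obvious basic solution is feasible. You let $x_4 = 0 = x_5 = x_7 = x_8$ and $x_1 = 8/3$, $x_2 = 2/3$, $x_3 = 1$, $x_6 = 28/3$, and $x_9 = 1/3$. You don't need to worry too much about this. It is the above matrix which is desired. Now you can assemble the simplex tableau and begin the algorithm. Remember $C \equiv 2x_1 + 3x_2 + 2x_3 + 3x_4$. First add the row and column which deal with $C$. This yields

\[
\left( \begin{array}{rrrrrrrrrrr}
0 & 1 & 0 & \frac{7}{3} & -1 & 0 & \frac{1}{3} & \frac{1}{3} & 0 & 0 & \frac{2}{3} \\
1 & 0 & 0 & \frac{1}{3} & 0 & 0 & \frac{1}{3} & -\frac{2}{3} & 0 & 0 & \frac{8}{3} \\
0 & 0 & 1 & -2 & 1 & 0 & -1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & -\frac{1}{3} & 0 & 0 & -\frac{1}{3} & -\frac{1}{3} & 1 & 0 & \frac{1}{3} \\
0 & 0 & 0 & \frac{11}{3} & -1 & 1 & \frac{2}{3} & -\frac{7}{3} & 0 & 0 & \frac{28}{3} \\
-2 & -3 & -2 & -3 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{array} \right)
\tag{6.16}
\]
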

Now you do row operations to keep the simple columns of 6.15 simple in 6.16. Of course 
you could permute the columns if you wanted but this is not necessary. 

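This yields the following for a simplex tableau. Now it is a matter of getting rid of the positive entries in the bottom row because you are trying to minimize.

\[
\left( \begin{array}{rrrrrrrrrrr}
0 & 1 & 0 & \frac{7}{3} & -1 & 0 & \frac{1}{3} & \frac{1}{3} & 0 & 0 & \frac{2}{3} \\
1 & 0 & 0 & \frac{1}{3} & 0 & 0 & \frac{1}{3} & -\frac{2}{3} & 0 & 0 & \frac{8}{3} \\
0 & 0 & 1 & -2 & 1 & 0 & -1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & -\frac{1}{3} & 0 & 0 & -\frac{1}{3} & -\frac{1}{3} & 1 & 0 & \frac{1}{3} \\
0 & 0 & 0 & \frac{11}{3} & -1 & 1 & \frac{2}{3} & -\frac{7}{3} & 0 & 0 & \frac{28}{3} \\
0 & 0 & 0 & \frac{2}{3} & -1 & 0 & -\frac{1}{3} & -\frac{1}{3} & 0 & 1 & \frac{28}{3}
\end{array} \right)
\]

The most positive of them is the 2/3 and so I will apply the algorithm to this one first. The pivot is the 7/3. After doing the row operation the next tableau is

\[
\left( \begin{array}{rrrrrrrrrrr}
0 & \frac{3}{7} & 0 & 1 & -\frac{3}{7} & 0 & \frac{1}{7} & \frac{1}{7} & 0 & 0 & \frac{2}{7} \\
1 & -\frac{1}{7} & 0 & 0 & \frac{1}{7} & 0 & \frac{2}{7} & -\frac{5}{7} & 0 & 0 & \frac{18}{7} \\
0 & \frac{6}{7} & 1 & 0 & \frac{1}{7} & 0 & -\frac{5}{7} & \frac{2}{7} & 0 & 0 & \frac{11}{7} \\
0 & \frac{1}{7} & 0 & 0 & -\frac{1}{7} & 0 & -\frac{2}{7} & -\frac{2}{7} & 1 & 0 & \frac{3}{7} \\
0 & -\frac{11}{7} & 0 & 0 & \frac{4}{7} & 1 & \frac{1}{7} & -\frac{20}{7} & 0 & 0 & \frac{58}{7} \\
0 & -\frac{2}{7} & 0 & 0 & -\frac{5}{7} & 0 & -\frac{3}{7} & -\frac{3}{7} & 0 & 1 & \frac{64}{7}
\end{array} \right)
\]

and you see that all the entries in the bottom row to the left of the 1 are nonpositive and so the minimum is 64/7 and it occurs when $x_1 = 18/7$, $x_2 = 0$, $x_3 = 11/7$, $x_4 = 2/7$.

There is no maximum for the above problem. However, I will pretend I don't know this and attempt to use the simplex algorithm. You set up the simplex tableau the same way. Recall it is

\[
\left( \begin{array}{rrrrrrrrrrr}
0 & 1 & 0 & \frac{7}{3} & -1 & 0 & \frac{1}{3} & \frac{1}{3} & 0 & 0 & \frac{2}{3} \\
1 & 0 & 0 & \frac{1}{3} & 0 & 0 & \frac{1}{3} & -\frac{2}{3} & 0 & 0 & \frac{8}{3} \\
0 & 0 & 1 & -2 & 1 & 0 & -1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & -\frac{1}{3} & 0 & 0 & -\frac{1}{3} & -\frac{1}{3} & 1 & 0 & \frac{1}{3} \\
0 & 0 & 0 & \frac{11}{3} & -1 & 1 & \frac{2}{3} & -\frac{7}{3} & 0 & 0 & \frac{28}{3} \\
0 & 0 & 0 & \frac{2}{3} & -1 & 0 & -\frac{1}{3} & -\frac{1}{3} & 0 & 1 & \frac{28}{3}
\end{array} \right)
\]

Now to maximize, you try to get rid of the negative entries in the bottom left row. The most negative entry is the $-1$ in the fifth column. The pivot is the 1 in the third row of this column. The new tableau is

\[
\left( \begin{array}{rrrrrrrrrrr}
0 & 1 & 1 & \frac{1}{3} & 0 & 0 & -\frac{2}{3} & \frac{1}{3} & 0 & 0 & \frac{5}{3} \\
1 & 0 & 0 & \frac{1}{3} & 0 & 0 & \frac{1}{3} & -\frac{2}{3} & 0 & 0 & \frac{8}{3} \\
0 & 0 & 1 & -2 & 1 & 0 & -1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & -\frac{1}{3} & 0 & 0 & -\frac{1}{3} & -\frac{1}{3} & 1 & 0 & \frac{1}{3} \\
0 & 0 & 1 & \frac{5}{3} & 0 & 1 & -\frac{1}{3} & -\frac{7}{3} & 0 & 0 & \frac{31}{3} \\
0 & 0 & 1 & -\frac{4}{3} & 0 & 0 & -\frac{4}{3} & -\frac{1}{3} & 0 & 1 & \frac{31}{3}
\end{array} \right)
\]

Consider the fourth column. The pivot is the top 1/3. The new tableau is

\[
\left( \begin{array}{rrrrrrrrrrr}
0 & 3 & 3 & 1 & 0 & 0 & -2 & 1 & 0 & 0 & 5 \\
1 & -1 & -1 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & 1 \\
0 & 6 & 7 & 0 & 1 & 0 & -5 & 2 & 0 & 0 & 11 \\
0 & 1 & 1 & 0 & 0 & 0 & -1 & 0 & 1 & 0 & 2 \\
0 & -5 & -4 & 0 & 0 & 1 & 3 & -4 & 0 & 0 & 2 \\
0 & 4 & 5 & 0 & 0 & 0 & -4 & 1 & 0 & 1 & 17
\end{array} \right)
\]

There is still a negative in the bottom, the $-4$. The pivot in that column is the 3. The algorithm yields

\[
\left( \begin{array}{rrrrrrrrrrr}
0 & -\frac{1}{3} & \frac{1}{3} & 1 & 0 & \frac{2}{3} & 0 & -\frac{5}{3} & 0 & 0 & \frac{19}{3} \\
1 & \frac{2}{3} & \frac{1}{3} & 0 & 0 & -\frac{1}{3} & 0 & \frac{1}{3} & 0 & 0 & \frac{1}{3} \\
0 & -\frac{7}{3} & \frac{1}{3} & 0 & 1 & \frac{5}{3} & 0 & -\frac{14}{3} & 0 & 0 & \frac{43}{3} \\
0 & -\frac{2}{3} & -\frac{1}{3} & 0 & 0 & \frac{1}{3} & 0 & -\frac{4}{3} & 1 & 0 & \frac{8}{3} \\
0 & -\frac{5}{3} & -\frac{4}{3} & 0 & 0 & \frac{1}{3} & 1 & -\frac{4}{3} & 0 & 0 & \frac{2}{3} \\
0 & -\frac{8}{3} & -\frac{1}{3} & 0 & 0 & \frac{4}{3} & 0 & -\frac{13}{3} & 0 & 1 & \frac{59}{3}
\end{array} \right)
\]

Note how $z$ keeps getting larger. Consider the column having the $-13/3$ in it. The pivot is the single positive entry, 1/3. The next tableau is

\[
\left( \begin{array}{rrrrrrrrrrr}
5 & 3 & 2 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 8 \\
3 & 2 & 1 & 0 & 0 & -1 & 0 & 1 & 0 & 0 & 1 \\
14 & 7 & 5 & 0 & 1 & -3 & 0 & 0 & 0 & 0 & 19 \\
4 & 2 & 1 & 0 & 0 & -1 & 0 & 0 & 1 & 0 & 4 \\
4 & 1 & 0 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 2 \\
13 & 6 & 4 & 0 & 0 & -3 & 0 & 0 & 0 & 1 & 24
\end{array} \right)
\]

There is a column consisting of all negative entries. There is therefore no maximum. Note also how there is no way to pick the pivot in that column. 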

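Example 6.3.4 Minimize $z \equiv x_1 - 3x_2 + x_3$ subject to the constraints $x_1 + x_2 + x_3 \le 10$, $x_1 + x_2 + x_3 \ge 2$, $x_1 + x_2 + 3x_3 \le 8$ and $x_1 + 2x_2 + x_3 \le 7$ with all variables nonnegative.

There exists an answer because the region defined by the constraints is closed and bounded. Adding in slack variables you get the following augmented matrix corresponding to the constraints.

\[
\begin{pmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 10 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 2 \\
1 & 1 & 3 & 0 & 0 & 1 & 0 & 8 \\
1 & 2 & 1 & 0 & 0 & 0 & 1 & 7
\end{pmatrix}
\]

Of course there is a problem with the obvious solution obtained by setting to zero all variables corresponding to a nonsimple column because of the simple column which has the $-1$ in it. Therefore, I will use the simplex algorithm to make this column nonsimple. The third column has the 1 in the second row as the pivot so I will use this column. This yields

\[
\begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 0 & 0 & 8 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 2 \\
-2 & -2 & 0 & 0 & 3 & 1 & 0 & 2 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 5
\end{pmatrix}
\tag{6.17}
\]

and the obvious solution is feasible. Now it is time to assemble the simplex tableau. First add in the bottom row and second to last column corresponding to the equation for $z$. This yields

\[
\begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 8 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 0 & 2 \\
-2 & -2 & 0 & 0 & 3 & 1 & 0 & 0 & 2 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 5 \\
-1 & 3 & -1 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}
\]

Next you need to zero out the entries in the bottom row which are below one of the simple columns in 6.17. This yields the simplex tableau

\[
\begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 8 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 0 & 2 \\
-2 & -2 & 0 & 0 & 3 & 1 & 0 & 0 & 2 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 5 \\
0 & 4 & 0 & 0 & -1 & 0 & 0 & 1 & 2
\end{pmatrix}
\]

The desire is to minimize this so you need to get rid of the positive entries in the left bottom row. There is only one such entry, the 4. In that column the pivot is the 1 in the second row of this column. Thus the next tableau is

\[
\begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 8 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 0 & 2 \\
0 & 0 & 2 & 0 & 1 & 1 & 0 & 0 & 6 \\
-1 & 0 & -1 & 0 & 2 & 0 & 1 & 0 & 3 \\
-4 & 0 & -4 & 0 & 3 & 0 & 0 & 1 & -6
\end{pmatrix}
\]

There is still a positive number there, the 3. The pivot in this column is the 2. Apply the algorithm again. This yields

\[
\begin{pmatrix}
\frac{1}{2} & 0 & \frac{1}{2} & 1 & 0 & 0 & -\frac{1}{2} & 0 & \frac{13}{2} \\
\frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{7}{2} \\
\frac{1}{2} & 0 & \frac{5}{2} & 0 & 0 & 1 & -\frac{1}{2} & 0 & \frac{9}{2} \\
-\frac{1}{2} & 0 & -\frac{1}{2} & 0 & 1 & 0 & \frac{1}{2} & 0 & \frac{3}{2} \\
-\frac{5}{2} & 0 & -\frac{5}{2} & 0 & 0 & 0 & -\frac{3}{2} & 1 & -\frac{21}{2}
\end{pmatrix}
\]

Now all the entries in the left bottom row are nonpositive so the process has stopped. The minimum is $-21/2$. It occurs when $x_1 = 0$, $x_2 = 7/2$, $x_3 = 0$. 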

Now consider the same problem but change the word, minimize to the word, maximize. 

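Example 6.3.5 Maximize $z = x_1 - 3x_2 + x_3$ subject to the constraints $x_1 + x_2 + x_3 \le 10$, $x_1 + x_2 + x_3 \ge 2$, $x_1 + x_2 + 3x_3 \le 8$ and $x_1 + 2x_2 + x_3 \le 7$ with all variables nonnegative.

The first part of it is the same. You wind up with the same simplex tableau,

\[
\begin{pmatrix}
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 8 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 0 & 2 \\
-2 & -2 & 0 & 0 & 3 & 1 & 0 & 0 & 2 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 5 \\
0 & 4 & 0 & 0 & -1 & 0 & 0 & 1 & 2
\end{pmatrix}
\]
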

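but this time, you apply the algorithm to get rid of the negative entries in the left bottom row. There is a $-1$. Use this column. The pivot is the 3. The next tableau is

\[
\begin{pmatrix}
\frac{2}{3} & \frac{2}{3} & 0 & 1 & 0 & -\frac{1}{3} & 0 & 0 & \frac{22}{3} \\
\frac{1}{3} & \frac{1}{3} & 1 & 0 & 0 & \frac{1}{3} & 0 & 0 & \frac{8}{3} \\
-\frac{2}{3} & -\frac{2}{3} & 0 & 0 & 1 & \frac{1}{3} & 0 & 0 & \frac{2}{3} \\
\frac{2}{3} & \frac{5}{3} & 0 & 0 & 0 & -\frac{1}{3} & 1 & 0 & \frac{13}{3} \\
-\frac{2}{3} & \frac{10}{3} & 0 & 0 & 0 & \frac{1}{3} & 0 & 1 & \frac{8}{3}
\end{pmatrix}
\]

There is still a negative entry, the $-2/3$. This will be the new pivot column. The pivot is the 2/3 on the fourth row. This yields

\[
\begin{pmatrix}
0 & -1 & 0 & 1 & 0 & 0 & -1 & 0 & 3 \\
0 & -\frac{1}{2} & 1 & 0 & 0 & \frac{1}{2} & -\frac{1}{2} & 0 & \frac{1}{2} \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 5 \\
1 & \frac{5}{2} & 0 & 0 & 0 & -\frac{1}{2} & \frac{3}{2} & 0 & \frac{13}{2} \\
0 & 5 & 0 & 0 & 0 & 0 & 1 & 1 & 7
\end{pmatrix}
\]

and the process stops. The maximum for $z$ is 7 and it occurs when $x_1 = 13/2$, $x_2 = 0$, $x_3 = 1/2$. 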



6.4 Finding A Basic Feasible Solution 

By now it should be fairly clear that finding a basic feasible solution can create considerable 
difficulty. Indeed, given a system of linear inequalities along with the requirement that each 
variable be nonnegative, do there even exist points satisfying all these inequalities? If you 
have many variables, you can't answer this by drawing a picture. Is there some other way 
to do this which is more systematic than what was presented above? The answer is yes. It 
is called the method of artificial variables. I will illustrate this method with an example. 



Example 6.4.1 Find a basic feasible solution to the system $2x_1 + x_2 - x_3 \ge 3$, $x_1 + x_2 + x_3 \ge 2$, $x_1 + x_2 + x_3 \le 7$ and $x \ge 0$.

If you write the appropriate augmented matrix with the slack variables,

\[
\begin{pmatrix}
2 & 1 & -1 & -1 & 0 & 0 & 3 \\
1 & 1 & 1 & 0 & -1 & 0 & 2 \\
1 & 1 & 1 & 0 & 0 & 1 & 7
\end{pmatrix}
\tag{6.18}
\]

The obvious solution is not feasible. This is why it would be hard to get started with the simplex method. What is the problem? It is those $-1$ entries in the fourth and fifth columns. To get around this, you add in artificial variables to get an augmented matrix of the form

\[
\begin{pmatrix}
2 & 1 & -1 & -1 & 0 & 0 & 1 & 0 & 3 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 1 & 2 \\
1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 7
\end{pmatrix}
\tag{6.19}
\]

Thus the variables are $x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8$. Suppose you can find a feasible solution to the system of equations represented by the above augmented matrix. Thus all variables are nonnegative. Suppose also that it can be done in such a way that $x_8$ and $x_7$ happen to be 0. Then it will follow that $x_1, \dots, x_6$ is a feasible solution for 6.18. Conversely, if you can find a feasible solution for 6.18, then letting $x_7$ and $x_8$ both equal zero, you have obtained a feasible solution to 6.19. Since all variables are nonnegative, $x_7$ and $x_8$ both equalling zero is equivalent to saying the minimum of $z = x_7 + x_8$ subject to the constraints represented by the above augmented matrix equals zero. This has proved the following simple observation.

Observation 6.4.2 There exists a feasible solution to the constraints represented by the augmented matrix of 6.18 and $x \ge 0$ if and only if the minimum of $x_7 + x_8$ subject to the constraints of 6.19 and $x \ge 0$ exists and equals 0. 

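Of course a similar observation would hold in other similar situations. Now the point of all this is that it is trivial to see a feasible solution to 6.19, namely $x_6 = 7$, $x_7 = 3$, $x_8 = 2$ and all the other variables may be set to equal zero. Therefore, it is easy to find an initial simplex tableau for the minimization problem just described. First add the column and row for $z$

\[
\begin{pmatrix}
2 & 1 & -1 & -1 & 0 & 0 & 1 & 0 & 0 & 3 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 1 & 0 & 2 \\
1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 7 \\
0 & 0 & 0 & 0 & 0 & 0 & -1 & -1 & 1 & 0
\end{pmatrix}
\]

Next it is necessary to make the last two columns on the bottom left row into simple columns. Performing the row operation, this yields an initial simplex tableau,

\[
\begin{pmatrix}
2 & 1 & -1 & -1 & 0 & 0 & 1 & 0 & 0 & 3 \\
1 & 1 & 1 & 0 & -1 & 0 & 0 & 1 & 0 & 2 \\
1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 7 \\
3 & 2 & 0 & -1 & -1 & 0 & 0 & 0 & 1 & 5
\end{pmatrix}
\]

Now the algorithm involves getting rid of the positive entries on the left bottom row. Begin with the first column. The pivot is the 2. An application of the simplex algorithm yields the new tableau

\[
\begin{pmatrix}
1 & \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & 0 & 0 & \frac{1}{2} & 0 & 0 & \frac{3}{2} \\
0 & \frac{1}{2} & \frac{3}{2} & \frac{1}{2} & -1 & 0 & -\frac{1}{2} & 1 & 0 & \frac{1}{2} \\
0 & \frac{1}{2} & \frac{3}{2} & \frac{1}{2} & 0 & 1 & -\frac{1}{2} & 0 & 0 & \frac{11}{2} \\
0 & \frac{1}{2} & \frac{3}{2} & \frac{1}{2} & -1 & 0 & -\frac{3}{2} & 0 & 1 & \frac{1}{2}
\end{pmatrix}
\]

Now go to the third column. The pivot is the 3/2 in the second row. An application of the simplex algorithm yields

\[
\begin{pmatrix}
1 & \frac{2}{3} & 0 & -\frac{1}{3} & -\frac{1}{3} & 0 & \frac{1}{3} & \frac{1}{3} & 0 & \frac{5}{3} \\
0 & \frac{1}{3} & 1 & \frac{1}{3} & -\frac{2}{3} & 0 & -\frac{1}{3} & \frac{2}{3} & 0 & \frac{1}{3} \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & -1 & 0 & 5 \\
0 & 0 & 0 & 0 & 0 & 0 & -1 & -1 & 1 & 0
\end{pmatrix}
\tag{6.20}
\]

and you see there are only nonpositive numbers on the bottom left row so the process stops and yields 0 for the minimum of $z = x_7 + x_8$. As for the other variables, $x_1 = 5/3$, $x_2 = 0$, $x_3 = 1/3$, $x_4 = 0$, $x_5 = 0$, $x_6 = 5$. Now as explained in the above observation, this is a basic feasible solution for the original system 6.18. 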
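The method of artificial variables in this example can be replayed mechanically. The following is a sketch under stated assumptions: the function names `minimize` and `basic_solution` are inventions of this illustration, the starting point is the initial simplex tableau of this example with objective $z = x_7 + x_8$, and no protection against cycling is included.

```python
from fractions import Fraction as F

def minimize(T):
    """Run the simplex method for a minimum and return the final tableau."""
    T = [[F(v) for v in row] for row in T]
    m, n = len(T) - 1, len(T[0]) - 2
    while True:
        col = max(range(n), key=lambda j: T[-1][j])    # most positive entry
        if T[-1][col] <= 0:
            return T
        rows = [r for r in range(m) if T[r][col] > 0]
        if not rows:
            raise ValueError("no minimum exists")
        row = min(rows, key=lambda r: T[r][-1] / T[r][col])
        p = T[row][col]
        T[row] = [v / p for v in T[row]]
        for r in range(m + 1):
            if r != row and T[r][col] != 0:
                f = T[r][col]
                T[r] = [a - f * q for a, q in zip(T[r], T[row])]

def basic_solution(T):
    """Read off the obvious solution: simple columns get the right-hand side."""
    m, n = len(T) - 1, len(T[0]) - 2
    x, used = [F(0)] * n, set()
    for j in range(n):
        nz = [r for r in range(m + 1) if T[r][j] != 0]
        if len(nz) == 1 and nz[0] < m and T[nz[0]][j] == 1 and nz[0] not in used:
            used.add(nz[0])
            x[j] = T[nz[0]][-1]
    return x

# Initial simplex tableau for minimizing z = x7 + x8 subject to 6.19
start = [[2, 1, -1, -1, 0, 0, 1, 0, 0, 3],
         [1, 1, 1, 0, -1, 0, 0, 1, 0, 2],
         [1, 1, 1, 0, 0, 1, 0, 0, 0, 7],
         [3, 2, 0, -1, -1, 0, 0, 0, 1, 5]]

final = minimize(start)
x = basic_solution(final)
print(final[-1][-1], [str(v) for v in x])
```

If the minimum printed were positive instead of zero, Observation 6.4.2 would say the original constraints have no feasible solution at all.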

Now consider a maximization problem associated with the above constraints. 

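Example 6.4.3 Maximize $x_1 - x_2 + 2x_3$ subject to the constraints $2x_1 + x_2 - x_3 \ge 3$, $x_1 + x_2 + x_3 \ge 2$, $x_1 + x_2 + x_3 \le 7$ and $x \ge 0$.

From 6.20 you can immediately assemble an initial simplex tableau. You begin with the first 6 columns and top 3 rows in 6.20. Then add in the column and row for $z$. This yields

\[
\begin{pmatrix}
1 & \frac{2}{3} & 0 & -\frac{1}{3} & -\frac{1}{3} & 0 & 0 & \frac{5}{3} \\
0 & \frac{1}{3} & 1 & \frac{1}{3} & -\frac{2}{3} & 0 & 0 & \frac{1}{3} \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 5 \\
-1 & 1 & -2 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}
\]

and you first do row operations to make the first and third columns simple columns. Thus the next simplex tableau is

\[
\begin{pmatrix}
1 & \frac{2}{3} & 0 & -\frac{1}{3} & -\frac{1}{3} & 0 & 0 & \frac{5}{3} \\
0 & \frac{1}{3} & 1 & \frac{1}{3} & -\frac{2}{3} & 0 & 0 & \frac{1}{3} \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 5 \\
0 & \frac{7}{3} & 0 & \frac{1}{3} & -\frac{5}{3} & 0 & 1 & \frac{7}{3}
\end{pmatrix}
\]

You are trying to get rid of negative entries in the bottom left row. There is only one, the $-5/3$. The pivot is the 1. The next simplex tableau is then

\[
\begin{pmatrix}
1 & \frac{2}{3} & 0 & -\frac{1}{3} & 0 & \frac{1}{3} & 0 & \frac{10}{3} \\
0 & \frac{1}{3} & 1 & \frac{1}{3} & 0 & \frac{2}{3} & 0 & \frac{11}{3} \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 5 \\
0 & \frac{7}{3} & 0 & \frac{1}{3} & 0 & \frac{5}{3} & 1 & \frac{32}{3}
\end{pmatrix}
\]

and so the maximum value of $z$ is 32/3 and it occurs when $x_1 = 10/3$, $x_2 = 0$ and $x_3 = 11/3$. 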

6.5 Duality 

You can solve minimization problems by solving maximization problems. You can also go 
the other direction and solve maximization problems by minimization problems. Sometimes 
this makes things much easier. To be more specific, the two problems to be considered are 

A.) Minimize $z = cx$ subject to $x \ge 0$ and $Ax \ge b$ and

B.) Maximize $w = yb$ such that $y \ge 0$ and $yA \le c$,

(equivalently $A^T y^T \le c^T$ and $w = b^T y^T$).

In these problems it is assumed $A$ is an $m \times p$ matrix. 

I will show how a solution of the first yields a solution of the second and then show how 
a solution of the second yields a solution of the first. The problems, A.) and B.) are called 
dual problems. 

Lemma 6.5.1 Let $x$ be a solution of the inequalities of A.) and let $y$ be a solution of the inequalities of B.). Then

\[ cx \ge yb, \]

and if equality holds in the above, then $x$ is the solution to A.) and $y$ is a solution to B.).

Proof: This follows immediately. Since $c \ge yA$ and $x \ge 0$, while $Ax \ge b$ and $y \ge 0$, $cx \ge yAx \ge yb$.

It follows from this that if $y$ satisfies the inequalities of B.) and $x$ satisfies the inequalities of A.) then if equality holds in the above, it must be that $x$ is a solution of A.) and $y$ is a solution of B.). Indeed, any other $\tilde{x}$ satisfying the inequalities of A.) has $c\tilde{x} \ge yb = cx$, and any other $\tilde{y}$ satisfying those of B.) has $\tilde{y}b \le cx = yb$. ■ 
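A quick numerical check of the lemma may be reassuring. The sketch below reuses the pig farmer's data from Example 6.3.3 for $A$, $b$, $c$; the particular vectors `x`, `y`, `x_star` and `y_star` are chosen only for this illustration, and the helper names are assumptions of the sketch.

```python
from fractions import Fraction as F

A = [[1, 2, 1, 3],
     [5, 3, 2, 1],
     [1, 2, 2, 1],
     [2, 1, 1, 1],
     [1, 1, 1, 1]]
b = [5, 8, 6, 7, 4]
c = [2, 3, 2, 3]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def primal_feasible(x):
    """x >= 0 and A x >= b."""
    return all(v >= 0 for v in x) and all(dot(row, x) >= bi for row, bi in zip(A, b))

def dual_feasible(y):
    """y >= 0 and y A <= c."""
    yA = [dot(y, [A[i][j] for i in range(len(A))]) for j in range(len(c))]
    return all(v >= 0 for v in y) and all(l <= r for l, r in zip(yA, c))

x = [7, 0, 0, 0]                 # a feasible but wasteful feed mix
y = [0, 0, 0, 1, 0]              # a simple dual-feasible vector
gap = dot(c, x) - dot(y, b)      # weak duality says this is >= 0

x_star = [F(18, 7), 0, F(11, 7), F(2, 7)]
y_star = [F(5, 7), 0, F(3, 7), F(3, 7), 0]
print(gap, dot(c, x_star), dot(y_star, b))
```

For the second pair the two values coincide, so by the lemma each vector solves its own problem.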

Now recall that to solve either of these problems using the simplex method, you first add in slack variables. Denote by $x'$ and $y'$ the enlarged list of variables. Thus $x'$ has at least $m$ entries and so does $y'$ and the inequalities involving $A$ were replaced by equalities whose augmented matrices were of the form

\[
\begin{pmatrix} A & -I & b \end{pmatrix}, \text{ and } \begin{pmatrix} A^T & I & c^T \end{pmatrix}
\]

Then you included the row and column for $z$ and $w$ to obtain

\[
\begin{pmatrix} A & -I & 0 & b \\ -c & 0 & 1 & 0 \end{pmatrix} \text{ and } \begin{pmatrix} A^T & I & 0 & c^T \\ -b^T & 0 & 1 & 0 \end{pmatrix}. \tag{6.21}
\]

Then the problems have basic feasible solutions if it is possible to permute the first $p + m$ columns in the above two matrices and obtain matrices of the form

\[
\begin{pmatrix} B & F & 0 & b \\ -c_B & -c_F & 1 & 0 \end{pmatrix} \text{ and } \begin{pmatrix} B_1 & F_1 & 0 & c^T \\ -b_{B_1}^T & -b_{F_1}^T & 1 & 0 \end{pmatrix} \tag{6.22}
\]

where $B, B_1$ are invertible $m \times m$ and $p \times p$ matrices and denoting the variables associated with these columns by $x_B, y_B$ and those variables associated with $F$ or $F_1$ by $x_F$ and $y_F$, it follows that letting $B x_B = b$ and $x_F = 0$, the resulting vector $x'$ is a solution to $x' \ge 0$ and $\begin{pmatrix} A & -I \end{pmatrix} x' = b$ with similar constraints holding for $y'$. In other words, it is possible to obtain simplex tableaus,

\[
\begin{pmatrix} I & B^{-1}F & 0 & B^{-1}b \\ 0 & c_B B^{-1} F - c_F & 1 & c_B B^{-1} b \end{pmatrix}, \quad
\begin{pmatrix} I & B_1^{-1}F_1 & 0 & B_1^{-1}c^T \\ 0 & b_{B_1}^T B_1^{-1} F_1 - b_{F_1}^T & 1 & b_{B_1}^T B_1^{-1} c^T \end{pmatrix} \tag{6.23}
\]

Similar considerations apply to the second problem. Thus as just described, a basic feasible solution is one which determines a simplex tableau like the above in which you get a feasible solution by setting all but the first $m$ variables equal to zero. The simplex algorithm takes you from one basic feasible solution to another till eventually, if there is no degeneracy, you obtain a basic feasible solution which yields the solution of the problem of interest. 

Theorem 6.5.2 Suppose there exists a solution $x$ to A.) where $x$ is a basic feasible solution of the inequalities of A.). Then there exists a solution $y$ to B.) and $cx = by$. It is also possible to find $y$ from $x$ using a simple formula.

Proof: Since the solution to A.) is basic and feasible, there exists a simplex tableau like 6.23 such that $x'$ can be split into $x_B$ and $x_F$ such that $x_F = 0$ and $x_B = B^{-1}b$. Now since it is a minimizer, it follows $c_B B^{-1} F - c_F \le 0$ and the minimum value for $cx$ is $c_B B^{-1} b$. Stating this again, $cx = c_B B^{-1} b$. Is it possible you can take $y = c_B B^{-1}$? From Lemma 6.5.1 this will be so if $c_B B^{-1}$ solves the constraints of problem B.). Is $c_B B^{-1} \ge 0$? Is $c_B B^{-1} A \le c$? These two conditions are satisfied if and only if $c_B B^{-1} \begin{pmatrix} A & -I \end{pmatrix} \le \begin{pmatrix} c & 0 \end{pmatrix}$. Referring to the process of permuting the columns of the first augmented matrix of 6.21 to get 6.22 and doing the same permutations on the columns of $\begin{pmatrix} A & -I \end{pmatrix}$ and $\begin{pmatrix} c & 0 \end{pmatrix}$, the desired inequality holds if and only if $c_B B^{-1} \begin{pmatrix} B & F \end{pmatrix} \le \begin{pmatrix} c_B & c_F \end{pmatrix}$ which is equivalent to saying $\begin{pmatrix} c_B & c_B B^{-1} F \end{pmatrix} \le \begin{pmatrix} c_B & c_F \end{pmatrix}$ and this is true because $c_B B^{-1} F - c_F \le 0$ due to the assumption that $x$ is a minimizer. The simple formula is just $y = c_B B^{-1}$. ■

The proof of the following corollary is similar. 

Corollary 6.5.3 Suppose there exists a solution, y to B.) where y is a basic feasible solution 
of the inequalities of B.). Then there exists a solution, x to A.) and ex = by. It is also 
possible to find x from y using a simple formula. In this case, and referring to 6.23, the 
simple formula is x = B^ T b Bl ■ 
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As an example, consider the pig farmers problem. The main difficulty in this problem 
was finding an initial simplex tableau. Now consider the following example and marvel at 
how all the difficulties d 



Example 6.5.4 



:e C = 2x\ + 3x 2 + 2x 3 + 3x4 subject to the constraints 



2x 2 + £3 + 3x 4 


> 


5. 


3x 2 + 2a; 3 + x 4 


> 


8. 


2x 2 + 2a; 3 + a; 4 


> 


G. 


+ x 2 +x 3 + X4, 


> 


7. 


+ X 2 +X 3 + Xi 


> 


4. 



Here the dual problem is to 



= 5j/i + 8y 2 + 6y 3 + 7y 4 + 4y 5 subject to the 
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constraints 



1 1 

1 1 



( Vi \ 



3 1111, 

V V5 ) 

Adding in slack variables, these inequalities are equivalent to the system of equations whose 
augmented matrix is 

5 12 110 2" 

3 2 110 10 

2 2 110 10 

11110 1 

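Now the obvious solution is feasible so there is no hunting for an initial obvious feasible 
solution required. Now add in the row and column for w. This yields 

$$
\left(\begin{array}{rrrrrrrrrrr}
1 & 5 & 1 & 2 & 1 & 1 & 0 & 0 & 0 & 0 & 2 \\
2 & 3 & 2 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 3 \\
1 & 2 & 2 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 2 \\
3 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 3 \\
-5 & -8 & -6 & -7 & -4 & 0 & 0 & 0 & 0 & 1 & 0
\end{array}\right)
$$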

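It is a maximization problem so you want to eliminate the negatives in the bottom left row. 
Pick the column having the one which is most negative, the $-8$. The pivot is the top 5. 
Then apply the simplex algorithm to obtain 

$$
\left(\begin{array}{rrrrrrrrrrr}
\frac{1}{5} & 1 & \frac{1}{5} & \frac{2}{5} & \frac{1}{5} & \frac{1}{5} & 0 & 0 & 0 & 0 & \frac{2}{5} \\
\frac{7}{5} & 0 & \frac{7}{5} & -\frac{1}{5} & \frac{2}{5} & -\frac{3}{5} & 1 & 0 & 0 & 0 & \frac{9}{5} \\
\frac{3}{5} & 0 & \frac{8}{5} & \frac{1}{5} & \frac{3}{5} & -\frac{2}{5} & 0 & 1 & 0 & 0 & \frac{6}{5} \\
\frac{14}{5} & 0 & \frac{4}{5} & \frac{3}{5} & \frac{4}{5} & -\frac{1}{5} & 0 & 0 & 1 & 0 & \frac{13}{5} \\
-\frac{17}{5} & 0 & -\frac{22}{5} & -\frac{19}{5} & -\frac{12}{5} & \frac{8}{5} & 0 & 0 & 0 & 1 & \frac{16}{5}
\end{array}\right)
$$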
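
The pivot rule used in this step (most negative entry in the bottom left row picks the column, smallest ratio of right side to positive column entry picks the row) can be sketched in a few lines. This is only an illustration under stated assumptions: the names `choose_pivot` and `pivot` are hypothetical, exact arithmetic is done with Python's `fractions.Fraction`, and ties are broken by taking the first candidate, which is one choice among several.

```python
from fractions import Fraction


def choose_pivot(T):
    """Return (row, col) of the next pivot, or None if no negative entry remains."""
    bottom = T[-1][:-2]                     # bottom row, excluding the w column and right side
    col = min(range(len(bottom)), key=lambda j: bottom[j])
    if bottom[col] >= 0:
        return None                         # nothing negative: the tableau is optimal
    # Ratio test over rows with a positive entry in the chosen column.
    ratios = [(T[i][-1] / T[i][col], i) for i in range(len(T) - 1) if T[i][col] > 0]
    if not ratios:
        raise ValueError("no positive entry in pivot column: objective is unbounded")
    return min(ratios)[1], col


def pivot(T, row, col):
    """Scale the pivot row so the pivot is 1, then clear the rest of the column."""
    p = T[row][col]
    T[row] = [v / p for v in T[row]]
    for i in range(len(T)):
        if i != row and T[i][col] != 0:
            factor = T[i][col]
            T[i] = [v - factor * pv for v, pv in zip(T[i], T[row])]


# Initial tableau of the dual problem, including the row and column for w.
tableau = [[Fraction(v) for v in r] for r in [
    [1, 5, 1, 2, 1, 1, 0, 0, 0, 0, 2],
    [2, 3, 2, 1, 1, 0, 1, 0, 0, 0, 3],
    [1, 2, 2, 1, 1, 0, 0, 1, 0, 0, 2],
    [3, 1, 1, 1, 1, 0, 0, 0, 1, 0, 3],
    [-5, -8, -6, -7, -4, 0, 0, 0, 0, 1, 0],
]]

first_pivot = choose_pivot(tableau)         # the 5 in the top row, second column
pivot(tableau, *first_pivot)
bottom_after_first = list(tableau[-1])      # bottom row after the first pivot

# Repeat the same step until the bottom left row has no negative entries.
while True:
    choice = choose_pivot(tableau)
    if choice is None:
        break
    pivot(tableau, *choice)

maximum = tableau[-1][-1]
print(first_pivot, bottom_after_first[-1], maximum)
```

Working in exact fractions rather than floating point keeps the intermediate tableaux comparable, entry by entry, with the hand computation.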



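There are still negative entries in the bottom left row. Do the simplex algorithm to the 
column which has the $-\frac{22}{5}$. The pivot is the $\frac{8}{5}$. This yields 

$$
\left(\begin{array}{rrrrrrrrrrr}
\frac{1}{8} & 1 & 0 & \frac{3}{8} & \frac{1}{8} & \frac{1}{4} & 0 & -\frac{1}{8} & 0 & 0 & \frac{1}{4} \\
\frac{7}{8} & 0 & 0 & -\frac{3}{8} & -\frac{1}{8} & -\frac{1}{4} & 1 & -\frac{7}{8} & 0 & 0 & \frac{3}{4} \\
\frac{3}{8} & 0 & 1 & \frac{1}{8} & \frac{3}{8} & -\frac{1}{4} & 0 & \frac{5}{8} & 0 & 0 & \frac{3}{4} \\
\frac{5}{2} & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 & 0 & -\frac{1}{2} & 1 & 0 & 2 \\
-\frac{7}{4} & 0 & 0 & -\frac{13}{4} & -\frac{3}{4} & \frac{1}{2} & 0 & \frac{11}{4} & 0 & 1 & \frac{13}{2}
\end{array}\right)
$$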



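and there are still negative numbers. Pick the column which has the $-13/4$. The pivot is 
the $3/8$ in the top. This yields 

$$
\left(\begin{array}{rrrrrrrrrrr}
\frac{1}{3} & \frac{8}{3} & 0 & 1 & \frac{1}{3} & \frac{2}{3} & 0 & -\frac{1}{3} & 0 & 0 & \frac{2}{3} \\
1 & 1 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & 1 \\
\frac{1}{3} & -\frac{1}{3} & 1 & 0 & \frac{1}{3} & -\frac{1}{3} & 0 & \frac{2}{3} & 0 & 0 & \frac{2}{3} \\
\frac{7}{3} & -\frac{4}{3} & 0 & 0 & \frac{1}{3} & -\frac{1}{3} & 0 & -\frac{1}{3} & 1 & 0 & \frac{5}{3} \\
-\frac{2}{3} & \frac{26}{3} & 0 & 0 & \frac{1}{3} & \frac{8}{3} & 0 & \frac{5}{3} & 0 & 1 & \frac{26}{3}
\end{array}\right)
$$

which has only one negative entry on the bottom left. The pivot for this first column is the 
$\frac{7}{3}$. The next tableau is 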



$$
\left(\begin{array}{rrrrrrrrrrr}
0 & \frac{20}{7} & 0 & 1 & \frac{2}{7} & \frac{5}{7} & 0 & -\frac{2}{7} & -\frac{1}{7} & 0 & \frac{3}{7} \\
0 & \frac{11}{7} & 0 & 0 & -\frac{1}{7} & \frac{1}{7} & 1 & -\frac{6}{7} & -\frac{3}{7} & 0 & \frac{2}{7} \\
0 & -\frac{1}{7} & 1 & 0 & \frac{2}{7} & -\frac{2}{7} & 0 & \frac{5}{7} & -\frac{1}{7} & 0 & \frac{3}{7} \\
1 & -\frac{4}{7} & 0 & 0 & \frac{1}{7} & -\frac{1}{7} & 0 & -\frac{1}{7} & \frac{3}{7} & 0 & \frac{5}{7} \\
0 & \frac{58}{7} & 0 & 0 & \frac{3}{7} & \frac{18}{7} & 0 & \frac{11}{7} & \frac{2}{7} & 1 & \frac{64}{7}
\end{array}\right)
$$

and all the entries in the left bottom row are nonnegative so the answer is 64/7. This is 
the same as obtained before. So what values for x are needed? Here the basic variables are 
$y_1, y_3, y_4, y_7$. Consider the original augmented matrix, one step before the simplex tableau. 

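$$
\left(\begin{array}{rrrrrrrrrr}
1 & 5 & 1 & 2 & 1 & 1 & 0 & 0 & 0 & 2 \\
2 & 3 & 2 & 1 & 1 & 0 & 1 & 0 & 0 & 3 \\
1 & 2 & 2 & 1 & 1 & 0 & 0 & 1 & 0 & 2 \\
3 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 3
\end{array}\right)
$$

Permute the columns to put the columns associated with these basic variables first. Thus 

$$
\left(\begin{array}{rrrrrrrrrr}
1 & 1 & 2 & 0 & 5 & 1 & 1 & 0 & 0 & 2 \\
2 & 2 & 1 & 1 & 3 & 1 & 0 & 0 & 0 & 3 \\
1 & 2 & 1 & 0 & 2 & 1 & 0 & 1 & 0 & 2 \\
3 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 3
\end{array}\right)
$$

The matrix B is 

$$
B = \left(\begin{array}{rrrr}
1 & 1 & 2 & 0 \\
2 & 2 & 1 & 1 \\
1 & 2 & 1 & 0 \\
3 & 1 & 1 & 0
\end{array}\right)
$$

and so $B^{-T}$ equals 

$$
B^{-T} = \left(\begin{array}{rrrr}
-\frac{1}{7} & -\frac{2}{7} & \frac{5}{7} & \frac{1}{7} \\
0 & 0 & 0 & 1 \\
-\frac{1}{7} & \frac{5}{7} & -\frac{2}{7} & -\frac{6}{7} \\
\frac{3}{7} & -\frac{1}{7} & -\frac{1}{7} & -\frac{3}{7}
\end{array}\right).
$$

Also $b_B^T = \begin{pmatrix} 5 & 6 & 7 & 0 \end{pmatrix}$ and so from Corollary 6.5.3, 

$$
x = B^{-T} b_B = \left(\begin{array}{rrrr}
-\frac{1}{7} & -\frac{2}{7} & \frac{5}{7} & \frac{1}{7} \\
0 & 0 & 0 & 1 \\
-\frac{1}{7} & \frac{5}{7} & -\frac{2}{7} & -\frac{6}{7} \\
\frac{3}{7} & -\frac{1}{7} & -\frac{1}{7} & -\frac{3}{7}
\end{array}\right)
\left(\begin{array}{c} 5 \\ 6 \\ 7 \\ 0 \end{array}\right)
= \left(\begin{array}{c} \frac{18}{7} \\ 0 \\ \frac{11}{7} \\ \frac{2}{7} \end{array}\right)
$$

which agrees with the original way of doing the problem. 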
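
The formula of Corollary 6.5.3 can be checked numerically without forming the inverse: solving $B^T x = b_B$ gives the same x. The following is a rough sketch of that check in exact arithmetic. The helper name `solve` is hypothetical, and it assumes the matrix passed to it is invertible, which holds here because B consists of the columns of a basis.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M x = rhs exactly by Gauss-Jordan elimination (M assumed invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        piv = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [v - factor * pv for v, pv in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]


# Columns of the augmented matrix belonging to the basic variables y1, y3, y4, y7.
B = [
    [1, 1, 2, 0],
    [2, 2, 1, 1],
    [1, 2, 1, 0],
    [3, 1, 1, 0],
]
b_B = [5, 6, 7, 0]    # objective coefficients of y1, y3, y4 and of the slack y7

B_T = [list(column) for column in zip(*B)]
x = solve(B_T, b_B)   # same as x = B^{-T} b_B

# Primal data, used to confirm that x is feasible and attains the dual maximum.
A = [[1, 2, 1, 3], [5, 3, 2, 1], [1, 2, 2, 1], [2, 1, 1, 1], [1, 1, 1, 1]]
b = [5, 8, 6, 7, 4]
c = [2, 3, 2, 3]
cost = sum(ci * xi for ci, xi in zip(c, x))
feasible = all(sum(a * xi for a, xi in zip(row, x)) >= bi for row, bi in zip(A, b))
print(x, cost, feasible)
```

The cost $cx$ equals the dual maximum 64/7, as the corollary asserts, and x satisfies all five primal constraints.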

Two good books which give more discussion of linear programming are Strang [25] and 
Nobel and Daniels [20]. Also listed in these books are other references which may prove 
useful if you are interested in seeing more on these topics. There is a great deal more which 
can be said about linear programming. 

6.6 Exercises 

1. Maximize and minimize $z = x_1 - 2x_2 + x_3$ subject to the constraints $x_1 + x_2 + x_3 \le 10$, 
$x_1 + x_2 + x_3 \ge 2$, and $x_1 + 2x_2 + x_3 \le 7$ if possible. All variables are nonnegative. 

2. Maximize and minimize the following if possible. All variables are nonnegative. 

(a) $z = x_1 - 2x_2$ subject to the constraints $x_1 + x_2 + x_3 \le 10$, $x_1 + x_2 + x_3 \ge 1$, and 
$x_1 + 2x_2 + x_3 \le 7$ 

(b) $z = x_1 - 2x_2 - 3x_3$ subject to the constraints $x_1 + x_2 + x_3 \le 8$, $x_1 + x_2 + 3x_3 \ge 1$, 
and $x_1 + x_2 + x_3 \le 7$ 

(c) $z = 2x_1 + x_2$ subject to the constraints $x_1 - x_2 + x_3 \le 10$, $x_1 + x_2 + x_3 \ge 1$, and 
$x_1 + 2x_2 + x_3 \le 7$ 

(d) $z = x_1 + 2x_2$ subject to the constraints $x_1 - x_2 + x_3 \le 10$, $x_1 + x_2 + x_3 \ge 1$, and 
$x_1 + 2x_2 + x_3 \le 7$ 

3. Consider contradictory constraints, $x_1 + x_2 \ge 12$ and $x_1 + 2x_2 \le 5$, $x_1 \ge 0$, $x_2 \ge 0$. 
You know these two contradict but show they contradict using the simplex algorithm. 

4. Find a solution to the following inequalities for $x, y \ge 0$ if it is possible to do so. If it 
is not possible, prove it is not possible. 

(b) $6x_1 + 4x_3 \le 11$, $5x_1 + 4x_2 + 4x_3 \ge 8$, $6x_1 + 6x_2 + 5x_3 \le 11$ 

(c) $6x_1 + 4x_3 \le 11$, $5x_1 + 4x_2 + 4x_3 \ge 9$, $6x_1 + 6x_2 + 5x_3 \le 9$ 

(d) $x_1 - x_2 + x_3 \le 2$, $x_1 + 2x_2 \ge 4$, $3x_1 + 2x_3 \le 7$ 

(e) $5x_1 - 2x_2 + 4x_3 \le 1$, $6x_1 - 3x_2 + 5x_3 \ge 2$, $5x_1 - 2x_2 + 4x_3 \le 5$ 

5. Minimize $z = x_1 + x_2$ subject to $x_1 + x_2 \ge 2$, $x_1 + 3x_2 \le 20$, $x_1 + x_2 \le 18$. Change 
to a maximization problem and solve as follows: Let $y_i = M - x_i$. Formulate in terms 
of $y_1, y_2$. 